To me, a 4/10 means the game is barely playable: it has huge issues and crashes every ten minutes, but it still shows some potential.
That is exactly how I'd describe Skyrim, yet it got 9/10 from most reviewers. This is with fully patched DLC that still consistently (6 out of 7 trials) generated game-breaking bugs during the main DLC story missions.
I'm inclined to think that charging extra money for content that doesn't work should incur some sort of fine, or even jail time. Maybe that's a little extreme. But I think we can at least agree it should result in quite a bit less than 9/10 in review scores.
However, we do know what caused the discrepancy between critic reviews and user reviews of Skyrim: critics only got the 360 version, which didn't have nearly as many bugs as the other versions. The absolutely unacceptable version of Skyrim is the PS3 version, which never should have been released. Its bugs also don't show up immediately; they slowly creep in until the game becomes unplayable. Since critics rarely play a game from beginning to end, they don't normally see mid- to late-game bugs. It would be nice if critics indicated how many hours they actually played, but I'd guess most play less than 20, and some much, much less than that.
From what I've seen of scoring, the longer you play a game, the more likely you are to give it a lower score. There is a logical reason for this. Large games like Skyrim are bound to have a few untested game-breaking bugs, and "open world" games tend to have this problem in spades. If you look at the Metacritic scores from critics and users on the first day of a game's launch, all of those reviews were done with the minimum time in the game. After that, later reviews tend to drag the average down by 0.5 to 1 point (on a 10-point scale) from where it started. People who review a week or two after launch have a much higher chance of encountering a game-breaking bug, or of just getting mad at a particular level design. Developers will also have removed most, if not all, of the easy-to-encounter game-breaking bugs in the first few levels, because they most likely built those levels first, and those levels, by virtue of being the first part of the game, have been tested the most.
Then you have games like Sonic that have the exact opposite problem: critics are normally a full point lower than users on those titles, or exactly the same. I'd say the issue is that games like Sonic are easy to grasp and understand; if there is a game-breaking bug or a bad game mechanic, it shows up early and stays for the entire game. Another example would be the Disgaea series. The result is that a critic who doesn't play for more than 10 hours can still get an accurate view of the game's quality.

However, this exposes the selection bias among users, which is a product of marketing. No gamer buys a game whose trailer put them off; they only buy it if the trailer and marketing actually made it look interesting. So you end up with an over-selection of users who expected to enjoy the game, and an under-selection of users who knew they wouldn't. For example, if you didn't like JRPGs, would you buy a game whose trailer screamed JRPG at you? Marketing can also backfire when it attracts the JRPG crowd: if you get a bunch of players interested in a game they think is in the JRPG style and it's not, they can outright hate it for not meeting their expectations. A good example of this is actually a movie, The Purge. A lot of people went in expecting some new take on horror and instead got a standard home-invasion flick. Critics don't really suffer from that problem. They can, but they might not have even encountered the marketing by the time they get the game; they're often playing simply because their boss wants a review ready by a deadline, and even when they have seen the marketing, they can't exactly choose not to review a game their boss assigned them.
Here is how I see the scores.
If critics are lower than users, the game could be a niche title where it should be obvious whether or not you'll like it. Or the title has some issue early on, like forced stealth, that angered the critics and caused them to view the game differently than someone who kept playing past those parts.
If critics are only 0.5 to 1 point higher than users, that's normal.
If users are lower than critics by more than 1 point, there are issues that are not readily apparent in the first few hours of play.
Then there are critics who are gaming the system. They long ago figured out that human beings talk more frequently about negative news than about positive news. This is probably why humans keep constructing scales and then using the bottom half to reflect the abysmally bad. The result is that bad news becomes clickbait when it's the only negative news around.
You can discover who is doing this by using the Law of Large Numbers.
According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
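You can sketch the idea with a quick simulation (every number here is made up for illustration): model each review as a game's "true" quality plus a random subjective element, and watch the average converge as the number of reviews grows.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

TRUE_QUALITY = 7.0  # hypothetical "true" score for some game


def average_score(n_reviews):
    """Average n_reviews scores, each modeled as the true quality
    plus up to +/-2 points of random subjective noise."""
    scores = [TRUE_QUALITY + random.uniform(-2, 2) for _ in range(n_reviews)]
    return sum(scores) / n_reviews


for n in (5, 50, 5000):
    print(f"{n:>5} reviews -> average {average_score(n):.2f}")
```

With a handful of reviews the average can wander a point or more, but with thousands it hugs 7.0. That is exactly why a critic's long-run average over many games is a meaningful baseline to compare against.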
The expected result is that a critic's average should approach the overall average, unless something is off with the critic. An honest divergence would be a food critic who is a supertaster: they would consistently hate foods the average person liked, but would be in line with the average of other supertasters, even if you think their opinion is "subjective". In that sense, the more "subjective" the scoring, the more likely the LLN applies, since subjectivity is just a random element, and that's what the LLN is all about: averaging mitigates the effect of a random subjective element.
For example, The Escapist's profile on Metacritic actually shows this measure, listing how many points higher or lower than average each critic scores. The Escapist averages only 0.12 points higher (on a 10-point scale) across a sample of over 400 games. That is what the LLN is all about: it shows The Escapist is in line with the expected results, even if they give one game a 9 that averaged an 8 and another game a 5 that should have gotten a 6.
However, even though most critics fall in line, some are going to exploit human negativity. Edge scores about 0.88 points lower than average (on a 10-point scale) across a sample of over 2,400 games. That is out of line with the LLN and warrants looking into. If a number of their reviews get simple facts badly wrong, then they would be generating clickbait rather than honest reviews.
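The check described above boils down to averaging a critic's signed gap from the consensus score over many games. Here is a minimal sketch of that comparison; the outlet names and every score below are invented for illustration, not real Metacritic data, and a real check would need hundreds of games per outlet before the LLN argument applies.

```python
# Hypothetical data: each pair is (this critic's score, consensus average)
# for the same game, on a 10-point scale. All numbers are made up.
reviews = {
    "Outlet A": [(9.0, 8.5), (5.0, 5.6), (7.0, 7.2), (8.0, 7.9)],
    "Outlet B": [(6.0, 7.4), (4.0, 5.5), (7.0, 8.3), (5.0, 6.2)],
}


def mean_deviation(pairs):
    """Average signed gap between a critic's scores and the consensus."""
    return sum(critic - consensus for critic, consensus in pairs) / len(pairs)


for outlet, pairs in reviews.items():
    dev = mean_deviation(pairs)
    verdict = "warrants a closer look" if abs(dev) > 0.5 else "in line with the LLN"
    print(f"{outlet}: {dev:+.2f} ({verdict})")
```

Outlet A's small deviation cancels out across games, while Outlet B sits well below consensus on every title, which is the pattern the text says deserves scrutiny.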