School’d is a series about the data we collect at Slader and what we’re learning from it.
Some of this data is pure novelty - fun stuff we've picked up from spending hours with our site and observing our users' behavior.
Other learnings seem more significant - not just for how we run our site, but for how students today are learning, and how they're using the Internet to support that learning.
Hundreds of thousands of students visit Slader.com each week for help with their homework. They are here by choice, not at the urging of their parents, their schools, or their teachers, and they're taking a proactive approach to their own learning.
What can we learn from them?
Five-star rating systems have their drawbacks. Most users will rate something a 1, a 4, or a 5 - they either love something or hate it, with not much in between - and if they don't feel strongly, they simply don't rate at all. Users are used to the system, and it looks 'normal' … but is it useful to us?
Here's a breakdown of how users rate content on Slader. There's a clear long tail, with most users rating just a few pieces of content. But there are still many who have rated 10-30 pieces of content, and some who have rated thousands.
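If you're curious how a breakdown like that gets produced, here's a minimal sketch. It assumes a flat export of ratings with hypothetical user_id and stars columns - illustrative names, not our actual schema.

    import pandas as pd

    # Hypothetical export: one row per rating. The file name and column names
    # are assumptions for illustration.
    ratings = pd.read_csv("ratings.csv")  # columns: user_id, stars, subject, has_comment

    # How many pieces of content has each user rated?
    per_user = ratings.groupby("user_id").size()

    # Bucket the counts to see the long tail: most users rate a handful of items,
    # a middle group rates 10-30, and a few power users rate thousands.
    buckets = pd.cut(per_user, bins=[0, 3, 10, 30, 100, 1000, float("inf")])
    print(buckets.value_counts().sort_index())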
I thought it’d be telling to see how our ratings compare for different types of content and in relation to other behaviors. I’ve charted alongside each other a) all ratings, b) ratings for Calculus solutions, c) ratings for Pre-algebra solutions and d) ratings where the user also commented on the content.
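Here's a hedged sketch of how those four breakdowns might be pulled together, using the same hypothetical column names as above. Normalizing each distribution to fractions is one way to make the shapes comparable when the subsets differ so much in size.

    import pandas as pd

    ratings = pd.read_csv("ratings.csv")  # hypothetical columns: stars, subject, has_comment

    subsets = {
        "all": ratings,
        "calculus": ratings[ratings["subject"] == "Calculus"],
        "pre-algebra": ratings[ratings["subject"] == "Pre-algebra"],
        "with comment": ratings[ratings["has_comment"]],
    }

    # Star distribution (1-5) for each subset, normalized to fractions so the
    # shapes are comparable even though the subsets differ wildly in size.
    dist = pd.DataFrame({
        name: df["stars"].value_counts(normalize=True).sort_index()
        for name, df in subsets.items()
    })
    print(dist)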
The first piece of information that stands out is that if a user comments on a piece of content, they’re much more likely to rate it a 1 than otherwise. We see this often - a user will point out a mistake with a solution or comment that they’d like more clarification. It’s a good method of getting the user who submitted the content to provide some additional information or fix an error.
Also noticeable are the relatively low number of low scores on Pre-algebra content compared with overall rating counts, and the relatively high number of low ratings on Calculus solutions … it's anyone's guess as to why. Perhaps younger students have a bit less negativity in them?
Note: I've taken slight liberties with the y-scale of this chart to make the datasets easier to compare side by side.
And now, for how those ratings break down by user. Because many users have rated just a few pieces of content, we see obvious spikes (which I've cropped) at the whole numbers - users can only rate content in whole stars. The average ratings skew toward the higher end of the scale and are somewhat scattered otherwise. It's an interesting dataset that, even with the hundreds of thousands of datapoints we're looking at, doesn't show a clear pattern. Need more data!
What this chart mostly tells us is that users rate content a 1 or a 5. If a user's average rating is near 2.5, their ratings have a lot of variance. Closer to the ends we see more consistency: users either love or hate all content.
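To make the mean-versus-variance point concrete, here's a minimal sketch of the per-user summary behind a chart like this, again with hypothetical column names. A user who splits ratings between 1s and 5s can easily land at a mean near 2.5 or 3 with a standard deviation near 2, while a user who genuinely hands out 2s and 3s lands at a similar mean with a much smaller spread.

    import pandas as pd

    ratings = pd.read_csv("ratings.csv")  # hypothetical columns: user_id, stars

    # Per-user rating count, mean, and standard deviation.
    per_user = ratings.groupby("user_id")["stars"].agg(["count", "mean", "std"])

    # Users with only a handful of ratings produce the spikes at whole numbers,
    # so it helps to look at heavier raters separately.
    heavy = per_user[per_user["count"] >= 10]

    # Bucket by mean rating and look at the typical spread within each bucket:
    # mid-range averages should show a large std (1s and 5s mixed together),
    # while averages near 1 or 5 should show a small std (consistent raters).
    by_mean = pd.cut(heavy["mean"], bins=[1, 2, 3, 4, 5], include_lowest=True)
    print(heavy.groupby(by_mean)["std"].median())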
In summary, what’s clear from a quick review of this data is that using star ratings for our content on Slader perhaps isn’t the ideal rating system. Users seem to use the 1 and 5 star buttons to rate content, and mostly ignore the rest of the buttons. We learn if they hate or love something, but more rarely do we get a nuanced assessment of the quality of the content.
And how does this data differ from elsewhere? Sites like Netflix have users who seem to rate content with a keener eye. Most other rating systems (movies, restaurants, products) tend to see a heavy loading of high star ratings (4s and 5s), but we see more polarization. One reason is that most ratings on Slader apply to math solutions, and if a user perceives a solution as 'wrong', that's a much more polarized feeling than not really liking the latest Tarantino film or the Chicken Parmesan at the local Italian joint.
Is it worth changing the system we use? Or do we make improvements to the star ratings? I tend to think that the possibility of gathering more nuanced data is better than assuming we’ll never get it …
Peter Bernheim is CTO of Slader.com. Questions? Comments? Something to add? Email me at firstname.lastname@example.org