Wednesday, May 29, 2013

Tetris pieces and you

Over the span of my life I have played a lot of Tetris.  I actually just sat down and tried to figure out when I might have reasonably played my first game of Tetris, and let's just say it was a very long time ago.

I'm not sure I even want to start coming up with a reasonable estimate for the number of hours I've played Tetris, because it's not likely to be best measured in hours (or even days).  I'm sure it would be even worse if we put it into the metric of "if all the [thing] I had done in my life was my full time job, how long would that job last?"  Let's just say I'm sure I've knocked out at least a few 40-hour work weeks.

We're not here to evaluate my life choices, though, we're here to talk about Tetris.  Before we continue too far I should also say I'm somewhat of a (Nintendo) Tetris purist.  I'm not going to say that we need to go back to something that will run on a Soviet DVK-2,


But I will say that there's really no need to go much further than a game that was for all intents perfectly executed to what it needed to be.  To put it in the words of Spock in Wrath of Khan - while lamenting the fact that Kirk allowed himself to be promoted to Admiral out of the role of Captain - "Commanding a starship is your first, best destiny.  Anything else is a waste of material."

Tetris is meant to be Tetris.  This is Tetris:


Anything else is a waste of material.  Anything else is like bedazzling the Mona Lisa.

Side note, I just thought that up but couldn't help but google it.  And...someone has actually bedazzled the Mona Lisa.

Anyway, we can talk about Tetris, but if you want a version where you can press a button to quick drop (in Tetris vernacular "hard drop"), or you want to be able to rotate pieces when they're against the wall and don't have room ("wall kick"), or you want to see where the piece would be if it were to keep falling where it is ("ghost piece"), or you want to be able to spin pieces to slow their descent ("easy spin" or "infinite spin"), or you want to be able to hold a piece and/or swap a piece into some sort of reserve ("cheating"), well, we're playing different games.

And you're playing the one designed for toddlers.  Baby's first Tetris, perhaps.

Sorry, am I being too hard here?

No.  No, I don't think so.  But let's get to the actual point here.

You can see in the above image that Tetris is happy to give you some numbers on what you're doing.  It's not really the most efficient to look at these numbers during a game and plan for pieces that might be coming, but it's interesting to look at after the fact.  "Oh, that's why I lost - I didn't get enough square pieces" (said no one ever).

Except we're really pushing for some Tetris-speak today, so instead of square pieces we'll call them O pieces.  Yep, that's their letter-association name.  From top to bottom in the above picture we have T, J, Z, O, S, L, I pieces.

Let's make it a little easier:



Okay, so now we're all on the same page.

If you have said anything at the end of the game while looking at the stats, it was probably something relating to your lack (or perceived but unsubstantiated lack) of straight "I" pieces.

Everything I've read would tend to indicate that the distribution of Tetris pieces (I'm sorry, "tetrominos") is random (if not randomly deterministic).

By the way, don't mistake that sentence as simple - those are two starts to some pretty deep wiki-spirals.

If the pieces are random, and you play forever (can you play Tetris forever?), you should get roughly a uniform distribution across the types of pieces.  How uniform does uniform really have to be to be considered uniform, though?

I've been playing and recording a game or two here and there, but mostly recording the stats from having others play.  All this has produced the following RAW DATA:



From this we can also calculate the proportions of pieces for each game.



This shows the relative distributions for each game.  Other than one game with way too many S pieces and one game with way too few J pieces, things do seem to be in a fairly tight range.  We can collapse this down to just means to get a feel if this bit of noise cancels out.


We can see that things look fairly uniform, though it's hard to tell exactly what the expected levels are.  Changing the scaling to bias lets us see what percentage of pieces we can expect above or below the expected uniform distribution per game:


The bias is quite small, and even with this fairly large number of pieces (>6700) a chi-square goodness-of-fit test fails to show a deviation from a uniform random distribution.  [ χ²(6) = 9.260, p = 0.16 ]
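
If you want to check a result like that yourself, here's a minimal Python sketch of a chi-square test against a uniform distribution.  The function names are mine, the counts at the bottom are made up purely for illustration (they are NOT the recorded games), and the p-value shortcut only works when the degrees of freedom are even, which happens to cover the 6 we get from seven pieces.

```python
import math

def chi_square_uniform(counts):
    """Chi-square statistic for observed counts vs. an equal share per category."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

def chi_square_p_value(statistic, df):
    """Upper-tail p-value; this closed form is only valid when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut needs a positive, even df")
    half = statistic / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

def bias(counts):
    """Each category's share of all pieces minus the uniform share (1/n)."""
    total = sum(counts)
    return [count / total - 1 / len(counts) for count in counts]

# Plugging in the statistic reported above:
print(round(chi_square_p_value(9.260, 6), 2))  # 0.16

# Hypothetical counts for T, J, Z, O, S, L, I (made up, not the real data):
made_up = [100, 95, 105, 98, 102, 97, 103]
print(chi_square_uniform(made_up))
print(bias(made_up))
```

The first print is just a sanity check that the statistic and the p-value quoted above agree with each other.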

Oddly, the most common piece in this data seems to be the piece we're always looking for (the "I"), but this does seem to just be a blip in otherwise random piece distribution.  I guess there's not really any reason to complain, at least overall, unless I want to complain about having too many of the best piece (and for no reason).

Anyway, why are you still reading this and not playing Tetris?

Wednesday, May 22, 2013

Rat Race, differential odds, and a practical application of binary math

I seem to have a lot of favorite games on The Price Is Right, and Rat Race is definitely among them.



I often don't like games where perfectly skilled play still presents only a chance of winning (like 1/2 Off), but there's just something about Rat Race.  Maybe it's the fact that perfectly skilled play always guarantees you at least the smallest prize, or maybe it's the thought of this guy in the back of the studio diligently applying neon coats to stock rat racers.



For those of you who don't know the game, it's fairly simple, but also fairly difficult.

Contestants are shown a series of three prizes, escalating in price.  The first prize is under $10, the second is under $100, and the third is under $500.  The contestant must guess the price of these items within a certain tolerance in order to earn bets on rats in the eventual race.

The tolerance is tied (non-linearly) to the level of the prize that they are guessing.  The first prize must be guessed within $1, the second within $10, and the third within $100.

If the first prize is - for example - $7, then the contestant has to guess between $6 and $8.  Since the error bars extend on either side of the price (it's not closest without going over like contestants' row or the showcase), each tolerance is actually only half of the window that the contestant actually gets to cover in terms of range of price.

This also means that completely random guessing (if the prices were completely random, which they're almost certainly not) would give you a 20% chance of guessing the first item, a 20% chance of guessing the second item, and a 40% chance of guessing the third item.  Realistically, a guess of $350 on the third prize covers a huge part of the likely-to-be-used part of the scale ($250-450), but that's a question for a different post.
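
Those percentages are just the width of the guessing window divided by the width of the price range.  A quick Python sketch, assuming (as above) that prices are spread evenly from $0 up to the ceiling and that your guess sits far enough from either end for the whole window to fit; the function names are mine:

```python
def guess_window(guess, tolerance):
    """Lowest and highest actual price that a guess would cover."""
    return (guess - tolerance, guess + tolerance)

def random_guess_chance(tolerance, price_ceiling):
    """Share of a $0-to-ceiling price range covered by a +/- tolerance window."""
    return 2 * tolerance / price_ceiling

# (tolerance, ceiling) for the three prizes described above
for tolerance, ceiling in [(1, 10), (10, 100), (100, 500)]:
    print(ceiling, random_guess_chance(tolerance, ceiling))

print(guess_window(350, 100))  # (250, 450)
```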

Guessing these prizes within tolerance is at least somewhat skill based, notwithstanding that there might be a slight bias toward random luck if played in a smart fashion.  What happens when you have the rats is where skill departs and luck takes over.

You can see that a flawless run of this game leaves you with three rats in the second half of the game.  That second half is the race itself.

Five rats of different colors (yellow, pink, orange, green, blue) race on an S-shaped track (it's actually $-shaped) that gives them each an equal distance to cover.  Not unlike horse racing, the contestant is trying to pick both the rat that wins as well as those that place.

The game is often played for a car, or something else fairly large.  This is won if the contestant selects as one of their rats the rat that finishes in first place.

Following this are two lesser prizes, one medium prize if the contestant selects the second place rat and one small prize if they select the third place rat.

Like I said, if played correctly (i.e. you guess each prize right and end up with three rats) you will always win something.  Even if you pick the three worst rats you'll still have the 3rd/4th/5th set and win the small prize associated with 3rd place.  How likely is that to happen, though?

Well, we can figure out some odds, but they are dependent on how well the contestant does on the first half of the game.  Let's start with the simplest case - the contestant doesn't guess any of the prizes correctly.

In that case, I bet they get to at least watch the race, but they have no chance of winning.  The outcome is simple:

Zero rats:
100% chance of no prize

The next step up isn't much harder - if the contestant guesses one prize correct and gets to select one rat.

One rat:
20% chance of large prize
20% chance of medium prize
20% chance of small prize
40% chance of no prize

Even if the contestant only gets one prize right, the odds are still in their favor - there's a 60% chance of winning something.

Now, we could keep working this out by hand, but there's a much more fun way to do it.  That way is through using binary math.

If you're not familiar with binary, you may never have gotten the joke that there are 10 types of people in the world, those who understand binary and those who don't.  Worry no more - you'll know all you need by the end of this post.

The number system that most of us are most familiar with is base 10.  In base 10 our numbers are all responsible for conveying 10 pieces of information before that information is passed up to a higher digit.

For instance, we can create 10 numbers with a single digit.  Those numbers are:

0
1
2
3
4
5
6
7
8
9

When we get to 9 we've run out of single digit numbers and have to go up a digit.  We do that by adding another digit to note that we have one complete set of the first digit.  The original digit resets to 0, and the new digit becomes a 1.  We don't write out the zeros that we're not using, but if we did you'd see it perhaps a little clearer when you make the transition:

00
01
02
03
04
05
06
07
08
09
10

If you've driven a car that had an old analog odometer you might have a good feel for this resetting of a digit and movement up to a higher digit.  If you want an example you can play around with this counter here.

If you understand this aspect of base 10 math then it's a simple jump to binary.  You see, binary is base 2 math.  Instead of 10 numbers to play around with there are only two - 0 and 1.

You count in exactly the same way, it's just that each digit holds a lot less information.

Let's start with the number 0.  Well, in binary it is still just 0.

0 = 0

When you move up to the number 1 nothing else changes, either.  1 is simply 1.

1 = 1

Moving up to 2 is where you have to apply the things I've just explained. You see, the character 2 doesn't exist in binary - we only have the numbers 0 and 1.  That doesn't mean that the number 2 doesn't exist in binary, it's just that we have to make it using only the characters 0 and 1.

Just like when we get to 9 in base 10 math, we are simply out of single digit characters.  Also just like in base 10 math, this is solved very simply by moving up to a higher digit and rolling over the first.

Thus,

2 = 10

Did you catch that?  And do you now get the joke?  Just as we get to 09 and have to increment the 0 to a 1 and reset the 9 back to a 0 (producing 10), we have to increment the 0 in 01 and reset the 1 in 01, resulting in 01 becoming 10.

When we get to 3 we're still good, actually, as just like the numbers 11-99 we still have room in the digits we have.  Thus,

3 = 11

When we get to 4, though, we need another digit.

4 = 100
5 = 101
6 = 110
7 = 111

At 7 we again run out of places to increment and need another digit to produce 8.

8 = 1000
9 = 1001
10 = 1010
11 = 1011
12 = 1100
13 = 1101
14 = 1110
15 = 1111

Same thing happens at the transition from 15 to 16.  In fact, the same thing will happen at the transition from any number 2^x-1 to 2^x - these are the powers of 2 (2, 4, 8, 16, 32, 64, 128, 256, 512, 1024...)
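
If you'd rather let a computer do the counting, here's a small Python sketch of the same idea.  It builds the binary digits by repeatedly dividing by 2 and keeping the remainders, which is one common way to do the conversion by hand; the function name is my own.

```python
def to_binary(number):
    """Write a non-negative whole number in base 2 using repeated division."""
    if number < 0:
        raise ValueError("only non-negative whole numbers, please")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        digits.append(str(number % 2))  # remainder is the next digit, right to left
        number //= 2
    return "".join(reversed(digits))

# Count from 0 to 16 and watch a new digit appear at 2, 4, 8 and 16
for n in range(17):
    print(n, "=", to_binary(n))
```

Python's built-in `format(n, "b")` does the same job in one step.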

You might start to recognize these numbers, and if you do it might help you understand why people use binary math.  There are certain places - like electronics - where it is most efficient to store information by having part of a circuit in either an on or off state.  These two states map perfectly well to 0s and 1s - base 2 math.

For instance, if we have a switch we can use it to store two values - off or on, 0 or 1.  If we have two switches we can use them to store four values, off/off, off/on, on/off, on/on, or 0, 1, 2, 3.  See what we're doing there?  Think about this next time you walk into a room with a bunch of switches on the wall.

This might seem like quite an aside, and it sort of is.  But one of the simplest ways to understand a contestant's chances in the game of Rat Race is to consider the fact that the contestant gets a few rats, and each of those rats can either win (1) or lose (0).  In fact, with 5 rats (digits) we can produce 32 outcomes.

It's also super easy to put them into a nice table.



You see, there are 32 potential betting outcomes in Rat Race, if you were able to bet on any number of rats from 0 to 5.  Obviously, betting on 4 or 5 rats would give you much better odds (and would also get two more products on TV), and would move the game from semi-skill to full-skill in that a perfect sweep of all five prize guesses would guarantee all three outcome prizes.

Since you can't bet on four or five rats, though, six of the above outcomes are off the table, leaving us with 26 potential events, and only one chance at winning all three prizes (with three rats).
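
The counting here is easy to check with a few lines of Python.  This is only a sketch of my reading of the table: each of the five digits is a finishing position, and a 1 means one of your rats ended up there.  The variable names are mine.

```python
from collections import Counter
from itertools import product

# Every 5-digit string of 0s and 1s, from 00000 to 11111
patterns = ["".join(bits) for bits in product("01", repeat=5)]

# How many patterns use exactly 0, 1, 2, 3, 4 or 5 rats (1s)
patterns_by_rats = Counter(pattern.count("1") for pattern in patterns)

# You can never hold more than three rats
allowed = [pattern for pattern in patterns if pattern.count("1") <= 3]

print(len(patterns))                         # 32
print(sorted(patterns_by_rats.items()))
print(len(allowed))                          # 26
```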

You can see that if you don't guess a single product you have no rats (1s) to place on the board, and no chance to win anything.  Once you get one rat, you can see that that rat can be in any of the places (each of the orange lines).  

With two rats there's still one way you can walk away empty-handed (the fourth line, first yellow line), by choosing the two losers.  

And finally, as we expected, the green lines (three rats) start with your worst case scenario netting you the small prize and two losers.

Overall, then, it breaks down as:

Zero rats:
100% chance of no prize

One rat:
60% chance of one prize
40% chance of no prize

Two rats:
30% chance of two prizes
60% chance of one prize
10% chance of no prize

Three rats:
10% chance of three prizes
60% chance of two prizes
30% chance of one prize
0% chance of no prize
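
The same breakdown falls out of a brute-force count.  Here's a minimal Python sketch, assuming every rat is equally likely to finish in any position (so every set of finishing places for your rats is equally likely); the names are my own.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

PLACES = range(1, 6)      # finishing positions, 1st through 5th
PRIZE_PLACES = {1, 2, 3}  # 1st, 2nd and 3rd each pay a prize

def prize_odds(rats):
    """Map number of prizes won -> probability, for a given number of rats."""
    picks = list(combinations(PLACES, rats))  # places your rats could finish in
    tally = Counter(len(PRIZE_PLACES.intersection(pick)) for pick in picks)
    return {prizes: Fraction(count, len(picks))
            for prizes, count in sorted(tally.items())}

for rats in range(4):
    odds = prize_odds(rats)
    summary = ", ".join(f"{float(chance):.0%} chance of {prizes} prize(s)"
                        for prizes, chance in odds.items())
    print(f"{rats} rat(s): {summary}")
```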

Overall, the advice is far from shocking, as with most TPIR games - win more rats and you have a better chance of winning overall.  But if you want to know exactly, well, there you go.   

Wednesday, May 15, 2013

How to make histograms in Excel (XP, 2007, 2010, 2011) and Google Docs without any stupid add-ons: Part II

So last week I left you with a number of random numbers and an idea of what we might be able to do with them.  That idea of what we might do was histograms, and those random numbers were these:

31
49
36
37
47
13
46
33
36
52
48
60
63
36
47
85
49
45
70
34
45
70
41
65
65
45
24
62
45
42
59
50
78
49
63
37
45
45
64
32
13
56
47
31
57
42
52
63
45
62

If you remember, these numbers should be drawn from a normal distribution with a mean of 50 and a standard deviation of 15.

Now, if you do a search of 'how to make histograms in excel' most of the responses will come up with a whole bunch of proprietary junk that builds you histograms if you buy and/or download it, with the remainder suggesting that you find your Excel CD to load a whole bunch of extra packages.  Many of these sites are trying to get some money out of you.  To be fair, I'm also trying to get some money out of you, I'm just a lot worse at it.  =)

Anyway, we're not here for that today, because we don't need that - you can make your own histograms perfectly well just with what Excel gave you.  For that matter, with what Google Docs gave you (for free!).

We're going to rely on two main concepts today.  The first is Excel's 'Frequency' function, and the second is the conceptual act of binning (not to be confused with Dr. John F. Binning).

You see, any program that will just make you a histogram all willy-nilly is making some choices for you, and those choices basically manifest in how many bars you get on the graph that is created.  For instance, this is a histogram of the above data:



The trick is that I've only created one bin - in this case for numbers from 0 to 100.  Every number was placed in the same bin because I made it far too large for the data.  Variance has been washed away completely.

This is actually a fairly important point - unless you create a bin for every number on your chart you are likely to display less variance than you actually have when you produce a standard histogram.  For the most part it's not something that anyone really worries too much about (unless you create a histogram like the one I just did), but it's something to keep in the back of your mind.  If any bar contains more than one number, then those numbers are no longer being treated as distinct.

Let's start with the idea of binning.  You may have already picked up on it from the above talk about bins, but the goal here is to create a number of bins (or buckets) into which we'll sort our numbers.  You want to pick something that makes sense, covers the range of numbers, maintains equal distance!, and maintains as much variance as possible.  We'll go through each of those steps in turn, but let's start by just making a pretty straightforward set of bins: sets of 10 from 1 to 100.

Open up your spreadsheet program of choice - I'm going to start by running through Excel 2010 but a lot of things are similar no matter what we do.  The main difference turns out to be the keyboard shortcuts between Windows and Mac (not surprisingly).  That said, it turns out that the things that work in Excel 2010 apply to Excel 2007 and Excel XP, and the things that work in Excel 2011 presumably work in whatever version Macs had before it.

We need the random numbers in one column, so go ahead and copy paste them in there - or better yet create your own.  Some of you might also just have some data you want to use, so all you need to do is make sure it's in some sort of array (like a column).

In a different column we need to create bins, and for this first part we can set them as mentioned, ten sets from 1 to 100.  To do this we need a column that looks like this:

10
20
30
40
50
60
70
80
90
100

The bins in Excel are defined by the distance between the prior number and that number, so the first bin contains all numbers 10 and below, and the second bin contains everything above 10 up to and including 20 (11 through 20 for whole numbers like ours).
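
If it helps to see that binning rule outside of a spreadsheet, here's a rough Python sketch of the same counting.  It is not how Excel implements anything, just the same rule written out: a value lands in the first bin whose number is greater than or equal to it.  Like the spreadsheet function, it returns one extra count at the end for anything above the last bin.  The function name is mine.

```python
from bisect import bisect_left

# The 50 random numbers from above
data = [31, 49, 36, 37, 47, 13, 46, 33, 36, 52,
        48, 60, 63, 36, 47, 85, 49, 45, 70, 34,
        45, 70, 41, 65, 65, 45, 24, 62, 45, 42,
        59, 50, 78, 49, 63, 37, 45, 45, 64, 32,
        13, 56, 47, 31, 57, 42, 52, 63, 45, 62]

bins = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

def frequency(values, bin_edges):
    """Count values per bin; bin_edges must be sorted from low to high."""
    counts = [0] * (len(bin_edges) + 1)  # last slot: values above the final bin
    for value in values:
        counts[bisect_left(bin_edges, value)] += 1
    return counts

for edge, count in zip(bins, frequency(data, bins)):
    print(f"up to {edge:>3}: {'#' * count} ({count})")
```

The row of # marks is a poor man's histogram, but it's enough to see the shape before you ever open the chart builder.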

Now for the tricky part.

You have a column of numbers, and you have a column defining your bins.  Now it's time to use the frequency function.

If you just go to any cell in your spreadsheet and type '=frequency(' you're going to get a little pop-up with some helpful notes on what you need to include in this formula.  In Excel it is going to prompt that you want a 'data_array' and a 'bins_array'.

An array is simply a systematic arrangement of objects, in the case of Excel an arrangement of objects in a column or row.  So, we know what we need - in my case I placed the random numbers in the first column starting at the top, so my data array is A1:A50.  If you placed your bins in the second column starting from the top your bins array would be B1:B10.  Your arrays may vary.

Don't go making your formula so fast, though.  If you're using Excel 2010 (or 2007 or XP) you need to do something else first.  You need to link a bunch of cells to this same formula.  This is done with CTRL+SHIFT+ENTER when you have all the destination cells selected.

Before you type out your frequency formula, select the 10 cells just next to the bins you created (or however many cells for however many bins - it's 10 for this example because there are 10 bins).  In my case this would be cells C1:C10.

Once those are selected, type out the frequency formula - for me this looks like '=frequency(A1:A50,B1:B10)'.  Instead of hitting just ENTER when you finish, though, hit CTRL+SHIFT+ENTER.

If you've done it right, it should have filled out each of the selected cells with counts that are in the bins next to those cells.  All you need to do now is make a chart in the normal fashion using the chart builder and you'll come up with something like this:



If you're using a Mac you might be using Excel 2011.  In this case the things we just did very likely did not work on your computer.  The steps are exactly the same, except for the whole CTRL+SHIFT+ENTER part.

You still need to have a data array and a bins array, and you can type out your frequency formula in the first cell of a new column just like on Windows (except you don't need to have all the destination cells selected when you do).  After you have that cell, however, press enter to get a value in it.  Then, select that cell and all others in the final array you're creating (those cells next to the bins).

Once you have the correct frequency formula in the first cell, and those cells selected, press CONTROL+U.   This should highlight a bunch of cells.  Then press COMMAND+SHIFT+ENTER, which should fill in the cells you're looking for.

I mentioned Google Docs, and the same technique should work there - perhaps depending on your operating system.  There's an interesting quirk in that Google Docs goes one more cell beyond what you select, and the count in that extra cell is everything above the final bin number.  It's hard to explain, but if you test it out with some data you should figure it out fairly quickly.

Google Docs also doesn't require any of this multi-key pressing, as if you simply start a frequency in a cell based on your two arrays it will fill out cells below that until it runs out of things in your bin array.  It actually takes out an entire step of odd keyboard shortcuts, which means it probably functions the same on both Windows and Mac (it seemed to work the same for me on both).

And best of all, you didn't need any fancy extra software.

So go make some histograms!

Wednesday, May 8, 2013

How to make random numbers and histograms in Excel (XP, 2007, 2010, 2011) and Google Docs: Part I

If you've been reading the blog for a while you know that I've complained a few times about both Google Docs and Microsoft Excel and their failure to easily convert data into histograms.  They can make bar charts out of data from categorized tables, but can't just take a raw data array and easily just convert it to a graphical representation of frequency (a histogram).

Now, there are ways to take raw data arrays and convert them into categorized tables, and we're going to talk about that in a bit (next week).  First, though, we should answer the easily sarcastic question: why is this so hard to do?

Well, it is, and it isn't.  There are plenty of statistical programs that will let you create a histogram fairly easily, but in doing so it's very easy to forget about some of the underlying information used to create that histogram.

Let's make things concrete and start with some numbers, shall we?

I could just make up a string of numbers, but then we wouldn't be learning anything from it (except how bad I am at making up numbers).  Instead, let's use the very programs with which we're looking to make histograms to make some random numbers.

Both Excel and Google Docs have some pretty decent random number generation, depending on what you actually consider random.  The heart of this is the function:

= rand()

This command will return a random number between 0 and 1.  If you're looking for a uniform random number this will take care of it.

Oh, you wanted a random number between some other range?  Say 0 to 50?  Well, then take your 0 to 1 random number and multiply it by 50.  You wanted it between 1 and 50?  Multiply it by 49 and then add 1.

You wanted it between -37 and 224?  First off, why?  Second, multiply your random number by 261 and then subtract 37.  DONE.

You want it between .4 and .6?  Feel free to take a stab at that one in the comments - I've given you enough to figure it out.

Think of it this way.  You and a friend are in a large room.  The floor has a long line from one end of the room to the other with 0 at the center of the room.  Marks are painted out on the line at each foot to mark out the (relatively low) positive and negative numbers.

Laying on the ground, with the tips of his arms resting neatly at 0 and at 1, is a mint condition Stretch Armstrong.



Stretch, in this example, illustrates what the rand() command has given you - a random number pulled from the range of 0 to 1.

You're pretty confident that you and your friend could each grab an arm and pull Stretch to either end of the room, and that's exactly what you're doing when you multiply your rand() output by any given number.

The first example of wanting a number from 0 to 50 may oversimplify things a bit.  You're multiplying by 50 because you want a range of values that covers the numbers from 0 to 50.  Things are easier when you want to start with 0, as it's always going to be the bottom value when you finish this multiplication step.

If you want a range that doesn't have 0 as the lower bound, then you need to shift that range one way or another.  Only after you multiply - if necessary - though.  It's why you multiply by 49 if you want the range to be from 1 to 50 instead of 0 to 50 - you have to start with a range that extends to 0 due to the fact that 0 multiplied by any number is still 0.  After multiplication you can simply increment in either direction.

This is accomplished by taking your stretched out Stretch Armstrong and walking up and down the number line - the range that Stretch's arms cover is the range from which your random number will be pulled.  If you stretch Stretch to 20 feet long and then walk him 10 feet to the left you'll have a random number centered on 0 within the values of -10 to 10 (ish).   
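
The stretch-then-walk idea is short enough to write down as code.  Here's a Python sketch, with `random.random()` standing in for the spreadsheet's rand(); the function names are my own.

```python
import random

def stretch(unit_value, low, high):
    """Stretch a 0-to-1 value to the width (high - low), then walk it over to low."""
    return unit_value * (high - low) + low

def rand_between(low, high):
    """A uniform random number from low up to (but not quite reaching) high."""
    return stretch(random.random(), low, high)

# The two ends of Stretch's arms for the examples above
print(stretch(0, 1, 50), stretch(1, 1, 50))        # 1 50
print(stretch(0, -37, 224), stretch(1, -37, 224))  # -37 224

# Twenty feet long, walked ten feet to the left
print(rand_between(-10, 10))
```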

I should note that both Excel and Google Docs have functions that allow you to specify a ceiling or floor for random numbers, but if you understand how rand() works there's really no need for it.  It's a completely redundant function, and you should feel angry that it's there.

We're looking to make a distribution to plot out on a histogram.  What rand() gives us is a uniform distribution, which makes for boring graphics.  How about something flashy, like a normal distribution?

Well, Excel and Google Docs don't have random normal commands (there are many programs that do), so we have to make use of some other functions to transform our rand() values into something a bit more...well, normal.

In Google Docs this function is:

=NORMINV(number, mean, standard deviation)

And in Excel it is:

=NORM.INV(number, mean, standard deviation)

Google actually sums it up pretty well in the description of the function:

"Returns the inverse of the normal distribution for the given Number in the distribution. Mean is the mean value in the normal distribution. STDEV is the standard deviation of the normal distribution."


Thus, if we use the form:

=NORMINV(rand(), 50, 15)

We'll end up with random numbers drawn from a normal distribution with a mean of 50 and a standard deviation of 15.  If we pulled 50 such numbers they might look something like this:

31
49
36
37
47
13
46
33
36
52
48
60
63
36
47
85
49
45
70
34
45
70
41
65
65
45
24
62
45
42
59
50
78
49
63
37
45
45
64
32
13
56
47
31
57
42
52
63
45
62
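
For the curious, the same trick (feed a uniform random number into the inverse of the normal distribution) can be sketched in Python with the standard library's `statistics.NormalDist`.  This is an illustration of the idea, not what the spreadsheets do under the hood; the function name and the seed are arbitrary choices of mine, and the numbers it prints will not match the list above.

```python
import random
from statistics import NormalDist

def random_normal(mean, stdev, rng=random):
    """Rough equivalent of NORMINV(rand(), mean, stdev)."""
    u = rng.random()
    while u == 0.0:  # inv_cdf needs 0 < p < 1, and random() can return exactly 0.0
        u = rng.random()
    return NormalDist(mean, stdev).inv_cdf(u)

random.seed(2013)  # arbitrary seed so the run is repeatable
sample = [round(random_normal(50, 15)) for _ in range(50)]
print(sample)
```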

And that's where we'll pick it up next week!

Wednesday, May 1, 2013

Pistachios OR One simple method to have a pretty weird grocery shopping experience



So for a long time I had been under the impression that I didn't really like pistachios.  Once - years ago - I had a bunch and then didn't feel great later on that evening.  The human mind has some great methods to create strong linkages between foods and feelin' bad, as back in a hunter gatherer stage such linkages would have been exceptionally useful.

In any case, I didn't really think much of it and just didn't really eat pistachios again.  It doesn't seem like I really came across them that much in day to day life, so it wasn't something that I really had to avoid.  

Recently, I was hanging out with some friends and they put out some pistachios, and I figured I'd give them a try.  Long story short, they're delicious.  Take that, hunter gatherer part of my brain.  

This was a few months ago, and I finally saw some bulk pistachios at the store last week.  I bought some, and realized that they're one of the few nuts that you can't buy without the shell (there are actually some good reasons for it, apparently).  You can certainly buy other nuts with their shells, but few of them really require it (with the exception of some of the more novel nuts, like hazelnuts).

Examining what I got from the store led to the conclusion that in any given scoop of pistachios you're getting mostly whole pistachios.  Aside from this, you're also getting some nuts that have come out of the shell and some shells that don't have nuts in them.  

I was curious about this, initially, and a cursory pass does seem to suggest this is noise that basically cancels out - you get about the same number of empty shells as you get shell-free nuts in any given pull.  In the aggregate, then, you're still basically getting whole nuts.

Continuing to think about this a little bit - while sitting around eating pistachios - a thought crossed my mind.  

What if you were really careful at the bulk bin?  Instead of taking indiscriminate scoops, why not try to be a bit more calculating?  If you could aim for already shelled nuts and avoid nut-free shells, you might save a little change (which you could use to buy more pistachios, duh).  

Now, the short answer to the longer question is that you don't do this because it would look really weird if you sat at the bulk bin picking out shells and shell-less nuts.  People would probably start to stare.  The extension would also be that at a certain point you should stop searching and just start shelling - take the good stuff to the scale and leave the shells behind.  

But how much is this really going to impact your total?  How much does a pistachio nut weigh in relation to a pistachio shell?

Easy enough - I have a kitchen scale for just these sorts of questions.  

The weight of one whole pistachio is...0 ounces.  

The weight of one pistachio shell is...presumably less than that.  

You see, my kitchen scale doesn't have the resolution to specify things at the level of a single pistachio, let alone a single shell.  This is not a problem only unique to my kitchen scale (or to pistachios).  

How to fix it?  Well, averages based on larger samples.  

10 pistachios give a measure on my scale, though the resolution is still not there to pick up small differences. In fact, 10 pistachios generally fluctuate between 1/8 and 1/4 of an ounce, meaning that the actual average weight is likely somewhere between those numbers.  My kitchen scale does not give values between those two numbers.

Where between does that actual number fall?  Well, my scale can't tell me that, at least not exactly.  But if I take enough samples I can get a proportion that shows what amount of time the scale comes up 1/8 vs 1/4.
If we treat these as the likely bounds of the weight for 10 pistachios, then the proportion of the time the higher number comes up is the percent distance we have to travel between those two numbers.  

Put another way, we can take the average of those measures and find a point between them that is a best guess for the true weight of 10 pistachios.  
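
Here's that reasoning as a small Python sketch.  The readings below are hypothetical, made up only to show the arithmetic (they are not my actual measurements), and the function names are mine.

```python
def interpolate(readings, low, high):
    """Travel from low toward high by the share of readings that came up high."""
    share_high = readings.count(high) / len(readings)
    return low + share_high * (high - low)

def average(readings):
    """Plain mean of the readings."""
    return sum(readings) / len(readings)

# Hypothetical scale readings (ounces) for repeated sets of 10 pistachios
readings = [0.125, 0.125, 0.125, 0.25]
ITEMS_PER_READING = 10

best_guess = interpolate(readings, 0.125, 0.25)
print(best_guess, average(readings))   # two routes to the same number
print(best_guess / ITEMS_PER_READING)  # best guess for a single pistachio
```

When the scale only ever shows those two values, the proportion method and the plain average are the same calculation.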

We can do the same for some sets of 20, even 40, and see if those help to give us a better picture of the scale (they do).  

At a certain point, it's simply time to eat some pistachios.  

I didn't weigh the nuts themselves, as with the weight of the whole pistachio all we really need is the weight of the shell - the average weight of the nut should be what's missing.  It's also much harder to shell a pistachio and then just set the nut and shell aside, especially when you only need to set one aside to be able to figure out the weight of whatever you eat.

The same idea of needing to use multiple shells on each measure holds, as the shells are lighter than the whole nut before shelling (obviously).  

All said and done, the weights come out as follows:

Whole nut: 0.042 ounces
Just shell: 0.019 ounces

Thus, just nut should be: 0.023 ounces

Interestingly, the weights of just the shell and just the nut are pretty close to each other.  This is a good sign - for every five empty shells you leave in the bin you should be able to take four shelled nuts away at basically an even weight trade.  

It really only makes sense to make the trade, as if you're just leaving shells to save money you'd have to do a lot to make a dent.  At $7.99 a pound, a single shell (without nut) is worth a little less than...1 cent?  What's a few cents going to buy you?

Since the nut and shell weigh about the same, it means that (at these prices) a full pistachio should run you just over 2 cents.  A pistachio nut runs just a bit over a cent.  So, the question of what a few cents will buy you is one or two whole pistachios (or roughly twice as many shelled pistachios). 

A single penny might not seem like much, but given the fact that the whole thing is only twice the price of the nut means you're looking at somewhere in the ballpark of 40-50% savings by removing the shell.  
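
If you want to redo the penny math with your own store's price, here's a short Python sketch using the weights measured above and the $7.99 per pound price.  The names are mine, and the nut weight is simply the whole pistachio minus the shell.

```python
PRICE_PER_POUND = 7.99  # dollars
OUNCES_PER_POUND = 16

WHOLE_OZ = 0.042               # measured: nut plus shell
SHELL_OZ = 0.019               # measured: shell only
NUT_OZ = WHOLE_OZ - SHELL_OZ   # inferred: what's left is the nut

def cost_in_cents(ounces):
    """What that many ounces costs at the bulk bin, in cents."""
    return ounces / OUNCES_PER_POUND * PRICE_PER_POUND * 100

for label, ounces in [("whole", WHOLE_OZ), ("shell", SHELL_OZ), ("nut", NUT_OZ)]:
    print(f"{label}: {ounces:.3f} oz, about {cost_in_cents(ounces):.2f} cents")

# Share of each whole pistachio's price that is really just shell
print(f"shell share of the price: {SHELL_OZ / WHOLE_OZ:.0%}")
```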

For a single nut, this might not make much sense.  But remember, we're doing this in aggregate!  Who goes to the store and buys a single pistachio?  

The logical conclusion is that you should - just as some people do with sweet corn - stand at the store and peel back those pennies from your pistachios.  

Just don't blame me if you get kindly (or unkindly) escorted out of the store.