iacas

Generating Scatter Plots

18 posts in this topic

I'm hoping that someone I tag at the bottom of this thread - or someone else - can help me out with something.

I'd like a way to generate a scatter plot on a Cartesian coordinate system that follows a bell curve but lets me set the standard deviation (the same value would apply in both the X and Y directions).

If it generates 100 "random" (fitting the standard deviation I set) points, that'd be perfect. I should be able to press a button to get a different set of 100 "random" dots.

The alternative would simply be a random number generator that obeys the bell curve and can kick out 100 (or 200, if I have to pair them) numbers that follow the standard deviation I set.

Can Excel or something else do this?

@jamo @boogielicious @Lihu @saevel25 @Golfingdad

http://www.random.org/gaussian-distributions/ ?

I hate when I research something, then when I write it up, I use a word or something that triggers a slightly different search, and then I find the answer. :P

Yeah, I don't think Excel can handle that without some fancy Excel work.

f(x,\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}

The only thing I could think of would be to set two cells in Excel to be your standard deviation and mean (say, A1 for the standard deviation and A2 for the mean).

Then you can do something like

1 / ($A$1*SQRT(2*PI())) * EXP(-((RANDBETWEEN(0,???) - $A$2)^2) / (2*$A$1^2))

Where ??? is the max random number you want. Not sure if that works out for ya. Just click and drag to fill it down. Every time you edit something on the worksheet the values should update, since RANDBETWEEN recalculates like that. Not sure if that is correct, or if it is what you want.

Excel has the inverse CDF for a few common distributions, including the normal.  With the inverse CDF, you can just plug in a uniform random number (i.e., RAND()) and get a random number that follows the distribution whose inverse CDF you used.

So, for a random number distributed normally with mean 0 and SD 1, you can just use NORMINV(RAND(); 0; 1)
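
For anyone who would rather do this outside Excel, here is a minimal Python sketch of the same inverse-CDF idea, producing 100 (x, y) points with a standard deviation you choose. The function name `normal_scatter`, the default values, and the `seed` argument are made up for illustration; only the standard library is used.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_scatter(n=100, sd=1.0, seed=None):
    """Return n (x, y) points; each coordinate is an independent N(0, sd) draw."""
    rng = random.Random(seed)
    dist = NormalDist(mu=0.0, sigma=sd)

    def draw():
        # Same idea as NORMINV(RAND(); 0; sd): push a uniform number
        # through the inverse CDF of the normal distribution.
        u = rng.random()
        while u == 0.0:  # inv_cdf is only defined strictly between 0 and 1
            u = rng.random()
        return dist.inv_cdf(u)

    return [(draw(), draw()) for _ in range(n)]

# A new seed (or seed=None) gives a different set of 100 dots each run.
points = normal_scatter(n=100, sd=5.0, seed=1)
```

Each call plays the role of pressing the "give me a new set" button; the points can then go into any plotting tool.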

One of the other built-in inverse CDFs in Excel is the log-normal.  That might be useful, Erik, if you want to plot non-symmetric landing spot scatters.  Most players definitely have a skewed distribution of distance (we all chunk one 2/3 of the way there, but do we ever airmail the green by 33% of the distance we meant to hit it?).  And many players have a skewed miss distribution.  As I've been trying to learn to draw the ball, my left/right misses have gotten more equal in percentage terms, but it used to be that my shot either hit the green/fairway or went right, with a long tail off to the right.  You could try using the lognormal (LOGINV) or gamma (GAMMAINV) to get skewed miss distributions (though those are both only defined on positive numbers, so you'd have to, say, subtract the mean to get misses left and right of 0).

COOL!!! Learn something new.

There is another factor to consider: for a right-hander, misses tend to be long left and short right. I'm sure you get what I mean.

Like this?

I added a 0.2 multiplier on the right column: if you miss left, the shot will tend to be a little longer, and if you miss right, a bit shorter.

The actual standard deviation is kind of irrelevant since I can just scale the image up or down to fit anyway.

Feel free to modify my Excel file and re-post, but… so far as I can tell we can just drag the scatter plot around a bit to get a fairly good representation. I don't want to make it too complex - chunking a shot now and then just won't factor, because you can't really "plan" for those anyway. So these are shots hit reasonably well.

There is another factor to consider, for a right-hander, long left, short right, I'm sure you get what I mean.

Yeah, but this gets more complicated because you can't generate independent random numbers for the X and Y dimensions.  You need to create numbers from a 2-dimensional distribution where there is correlation between the dimensions.  This is relatively easy to do in a stats language (e.g. R), but is a huge pain in the ass in Excel.
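
As a rough illustration of what "a 2-dimensional distribution with correlation between the dimensions" means, here is a small Python sketch (Python rather than R, purely as a choice for this example). It builds correlated normal X and Y values from two independent standard normal draws. The function name, the standard deviations, and the correlation value `rho` are assumptions for illustration, not numbers from this thread.

```python
import math
import random

def correlated_normal_points(n, sd_x, sd_y, rho, seed=None):
    """Return n (x, y) points from a zero-mean bivariate normal
    with standard deviations sd_x, sd_y and correlation rho."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)  # independent standard normal draws
        z2 = rng.gauss(0.0, 1.0)
        x = sd_x * z1
        # Mixing z1 into y is what creates the correlation with x.
        y = sd_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        points.append((x, y))
    return points

# Negative rho: shots that miss right (positive x) tend to come up short.
points = correlated_normal_points(100, sd_x=10.0, sd_y=5.0, rho=-0.4, seed=1)
```

With `rho = 0` this reduces to two independent draws, which is the plain scatter plot asked for at the top of the thread.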

Yeah, but this gets more complicated because you can't generate independent random numbers for the X and Y dimensions.  You need to create numbers from a 2-dimensional distribution where there is correlation between the dimensions.  This is relatively easy to do in a stats language (e.g. R), but is a huge pain in the ass in excel.

I dunno. I feel like I did it reasonably well. :)

I take the shot's value on X (left/right) and use it to create a value for Y by setting the mean to -0.2X.

I dunno. I feel like I did it reasonably well. :)

I take the shot's value on X (left/right) and use it to create a value for Y by setting the mean to -0.2X.

So why the 0.2 for the vertical?

So why the 0.2 for the vertical?

Because shots hit left (pulls) go farther. Shots hit right go shorter. It's actually -0.2 to account for this.

So a shot "10" left will have its mean set to "2" long.
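
The approach described above (draw X first, then draw Y with its mean set to -0.2X) can be sketched in Python roughly as follows. The spreadsheet itself isn't shown in the thread, so the function name, the two standard deviations, and the convention that negative X means "left" are assumptions for illustration only.

```python
import random

def shot_scatter(n=200, sd_x=10.0, sd_y=5.0, slope=-0.2, seed=None):
    """Return n (x, y) shots where y is drawn around a mean of slope * x."""
    rng = random.Random(seed)
    shots = []
    for _ in range(n):
        x = rng.gauss(0.0, sd_x)        # left/right miss (negative = left, assumed)
        y = rng.gauss(slope * x, sd_y)  # long/short miss, mean shifted by slope * x
        shots.append((x, y))
    return shots

shots = shot_scatter(seed=1)
```

With `slope=-0.2`, a shot at x = -10 (10 left) has its Y mean at 2 (2 long), matching the example above.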

Looks like the way I would have done it in Excel.  I used to use MathCad and software from the MathWorks for work.  They had these functions.  But that was a dozen years ago.  I haven't touched them since maybe 2002.

I dunno. I feel like I did it reasonably well. :)

I take the shot's value on X (left/right) and use it to create a value for Y by setting the mean to -0.2X.

Great solution for getting a reasonable looking scatter plot.  Was thinking it would be a pain in the ass to exactly generate a bivariate distribution with exactly the covariance and means you want, especially for anything non-normal, but this is obviously the right solution for getting a good looking scatter plot!  Reduction of SD for Y variable was a nice touch too, giving it more of a look of a bivariate correlated distribution.

As I noted above, one thing you could play with is skewed distributions to give, say, some of the way-short or big push-slice shots common among almost everyone except low-to-mid single-digit or better players.  Attached is an edited Excel sheet using two gamma distributions to get this effect.  It's more of a hack than with the normals, but you can play around with it to get it to look like you want.  The two parameters to the gamma variable in column A are alpha and beta.  Those are the second and third arguments to the GAMMAINV function.  Excel's GAMMAINV treats beta as a scale parameter, so the mean of the distribution is alpha * beta and the standard deviation is sqrt(alpha) * beta.  So you can play around with alpha and beta, and with the very ad hoc stuff I put in for alpha and beta in column B, to shift how it looks if you feel like it.

https://dl.dropboxusercontent.com/u/101944576/200PointPlot_Gamma.xlsx
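
For readers without the spreadsheet, here is a hedged Python sketch of the general trick of recentering a gamma draw to get a skewed miss on both sides of zero. It is not a copy of the attached file. Python's `random.gammavariate(alpha, beta)` takes a shape `alpha` and a scale `beta`, so its mean is `alpha * beta`; the function name and the parameter values below are made up for illustration.

```python
import math
import random

def skewed_misses(n, alpha, beta, seed=None):
    """Return n gamma draws shifted by their mean (alpha * beta), so misses
    fall on both sides of 0 with a longer tail on the positive side."""
    rng = random.Random(seed)
    center = alpha * beta  # mean of a gamma with shape alpha and scale beta
    return [rng.gammavariate(alpha, beta) - center for _ in range(n)]

# Example: standard deviation is sqrt(alpha) * beta = 6 for these values.
misses = skewed_misses(200, alpha=4.0, beta=3.0, seed=1)
```

Smaller `alpha` gives a more lopsided scatter; larger `alpha` looks closer to a normal.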

How do I add a non-pic, non-video attachment?

How do I add a non-pic, non-video attachment?

The paper clip icon.

Thanks. I'll download that and try it out. I don't know if we care too much about the weird outliers. People tend not to factor those in or believe that they hit those as often as they do anyway, so if we can make our case with a normal distribution (skewed slightly so that long left/short right is in there too), then we should be okay.

The paper clip icon.

Thanks. I'll download that and try it out. I don't know if we care too much about the weird outliers. People tend not to factor those in or believe that they hit those as often as they do anyway, so if we can make our case with a normal distribution (skewed slightly that long left/short right is in there too), then we should be okay.

Yeah, fair enough.  If you're just trying to make the point that you should aim so your scatter plot is centered on the middle of the green, then you probably don't need to worry about the skewed distribution.  It could make an interesting add-on point for newer players, though: if your right-left distribution in particular is skewed with a fat tail off to the right (lots of fades and slices, a few pulls, no hooks), then where you want to aim the heaviest part of the scatter plot changes depending on what's off to the right.  For example, if missing at all to the right is super penal (maybe OB is just right of the green), then you might actually want your median shot at left center or even the left edge of the green, accepting that a good number of decent shots will miss the green left in exchange for many fewer OB shots.

Ah.  I never actually noticed the "More" option off the right of the comment box.  I swear I'm on top of shit...

Thanks @mdl for the great info. My only issue is when reading your posts, I imagine the bear avatar speaking. :-\
