Due to the bad weather, cold temperatures, and painfully cold windchill, our President Lou Anna K. Simon wisely closed our alma mater – a very good decision in my opinion. First of all, we can all enjoy a longer vacation, we don’t get frostbite or die in a car accident, and most importantly we can prepare our lectures better. In my case that is “CSE 891: Computational Techniques for Large-Scale Data Analysis for MSBA”.
One of the things I try to use is current and interesting real-world examples, so I am constantly on the lookout for “big data problems”. With this in mind I found it quite interesting to see that after the first announcement at 6pm Sunday to close MSU for the entire Monday, we got another announcement on Monday at 6pm closing MSU until noon on Tuesday. And lo and behold, I just received another mail at 11am this Tuesday saying that MSU will indeed stay closed until 5pm.
The trained data analyst realizes two things immediately: three data points allow a fit, and both the intervals between announcements and the time until the announced reopening keep getting shorter – the hallmark of exponential decay!
The first thing we have to assume here is that there will be further announcements that predict MSU reopening less and less far in the future, until we get a final email explaining: “MSU opens NOW!” But can we predict this point using our big data analysis tools and methods?
Let us first scrape the data from my mail inbox. I used a simple pen and paper method and computed everything in my brain. I was tempted to write a web crawler to get the data from www.msu.edu or maybe send all my mails regarding the matter to Amazon Mechanical Turk to crowdsource the issue, but one truth always holds: if coding takes longer than solving it by hand, you shall not code (except for training purposes, of course)! This gives this initial plot:
with the announcement times on the x-axis and the time between the announcement and the potential reopening of MSU on the y-axis. I have to point out that I did some text analysis to interpolate the first data point. “Closed on Monday” doesn’t necessarily specify a time to open, so I estimated midnight.
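For anyone who prefers code to pen and paper, here is a minimal sketch of how the three data points could be written down. The helper `hours` and the convention of counting hours from Sunday 00:00 are my own choices for this illustration, and “midnight” is read as the end of Monday.

```python
# Hours are counted from Sunday 00:00; this convention is an assumption of the sketch.
DAY = {"Sun": 0, "Mon": 1, "Tue": 2, "Wed": 3}

def hours(day, hour):
    """Hours elapsed since Sunday 00:00."""
    return 24 * DAY[day] + hour

# (time the announcement arrived, announced reopening time)
announcements = [
    (hours("Sun", 18), hours("Tue", 0)),   # "closed Monday", read as until midnight
    (hours("Mon", 18), hours("Tue", 12)),  # closed until Tuesday noon
    (hours("Tue", 11), hours("Tue", 17)),  # closed until Tuesday 5pm
]

x = [sent for sent, _ in announcements]                # x-axis: announcement time
y = [reopen - sent for sent, reopen in announcements]  # y-axis: delay until reopening

print(x)  # [18, 42, 59]
print(y)  # [30, 18, 6]
```

Changing the guess for the first reopening time only changes the first entry of `y`.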
We can now fit the data to an exponential decay function:
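A common two-parameter form of exponential decay, assumed here for illustration (other parametrizations with the same two constants are equally possible), with x the announcement time and y the delay until the announced reopening:

$$ y(x) = a \, e^{-b x}, \qquad a > 0, \; b > 0 $$

Taking logarithms gives $\ln y = \ln a - b x$, a straight line in $x$, so a and b can be read off a linear fit to $(x, \ln y)$.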
We now find a and b and can check the fit to the data:
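One way to find a and b, sketched under assumptions: the decay is taken to be y = a·exp(−b·x), the data are my reading of the three announcements (hours since Sunday 00:00 and delay in hours), and the fit is an ordinary least-squares line through (x, ln y). A nonlinear fit on the raw values would weight the points differently and give somewhat different numbers.

```python
import math

# Assumed data: announcement time (hours since Sunday 00:00) and delay (hours).
x = [18, 42, 59]
y = [30, 18, 6]

def fit_exponential(xs, ys):
    """Least-squares line through (x, ln y); returns (a, b) for y = a*exp(-b*x)."""
    n = len(xs)
    logs = [math.log(v) for v in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxl = sum((xi - mean_x) * (li - mean_l) for xi, li in zip(xs, logs))
    slope = sxl / sxx
    intercept = mean_l - slope * mean_x
    return math.exp(intercept), -slope

a, b = fit_exponential(x, y)

# Compare the observed delays with the fitted curve.
for xi, yi in zip(x, y):
    print(xi, yi, round(a * math.exp(-b * xi), 1))
```

With only three points the fit is of course more decoration than statistics.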
Great! From this we can now estimate when MSU will finally be open again. Technically this function will never drop to zero, but I don’t expect to receive ever more frequent emails that predict a reopening ever closer in the future – similar to Zeno and his tortoise. I estimate that at some point, when the delay drops below say 5 minutes, we should receive the final email, which according to this plot
falls “exactly” on Wednesday 10am – what a surprise, my lecture starts at 10:20am!
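Reading the crossing point off a plot can also be done in closed form. A minimal sketch, assuming the decay has the form y = a·exp(−b·x) with x in hours since Sunday 00:00: setting y equal to the threshold gives x = ln(a / threshold) / b. The function names and the values of `a` and `b` below are placeholders for illustration, not the fitted parameters.

```python
import math

def time_below(a, b, threshold):
    """Solve a*exp(-b*x) = threshold for x (x in the time unit of 1/b)."""
    if a <= 0 or b <= 0 or threshold <= 0:
        raise ValueError("a, b and threshold must all be positive")
    return math.log(a / threshold) / b

def as_weekday(hours_since_sunday):
    """Format hours since Sunday 00:00 as 'Day HH:MM' (wraps after a week)."""
    days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
    total_minutes = round(hours_since_sunday * 60)
    day, minutes = divmod(total_minutes, 24 * 60)
    return f"{days[day % 7]} {minutes // 60:02d}:{minutes % 60:02d}"

# Placeholder parameters for illustration only, not the fitted values.
a, b = 60.0, 0.1
x_open = time_below(a, b, threshold=5 / 60)  # 5 minutes expressed in hours
print(as_weekday(x_open))
```

The result is very sensitive to b and to the chosen threshold, which is worth keeping in mind before scheduling a lecture around it.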