One of the (few) downsides of being in the Bay is the completely absurd traffic. Perhaps I am a bit more sensitive to this than most, given my epic daily commute. While I am normally inclined to whine about my cross-bay traverse a little, yesterday it paid off. You see, I’m used to making sense […]
The Quick and Dirty: For the moment let’s assume that we have some a priori hypothesis that we want to test. We can talk about two things: how big the relationship is and how strong it is. P-values don’t care about big – they only care about strong. To get a sense for this recall […]
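To make the big-versus-strong distinction concrete, here is a minimal sketch. It assumes a one-sample, two-sided z-test with a known standard deviation, which the excerpt does not specify; the function name and all of the numbers are made up for illustration.

```python
import math

def two_sided_p_value(effect, sigma, n):
    """Two-sided p-value for a z-test of 'the true mean is 0'.

    effect: observed mean (how BIG the relationship is)
    sigma:  standard deviation, assumed known for this sketch
    n:      sample size
    """
    z = effect / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# A tiny effect measured on a huge sample: the evidence is strong.
small_effect_big_n = two_sided_p_value(effect=0.02, sigma=1.0, n=100_000)

# A large effect measured on a handful of points: the evidence is weak.
big_effect_small_n = two_sided_p_value(effect=0.5, sigma=1.0, n=5)

print(f"small effect, n=100000: p = {small_effect_big_n:.3g}")
print(f"big effect,   n=5:      p = {big_effect_small_n:.3g}")
```

The first p-value comes out far below 0.05 even though the effect is tiny, and the second comes out well above 0.05 even though the effect is 25 times larger. The p-value tracks how much evidence there is, not how large the relationship is.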
Chocolate Cake (Wednesday, June 5, 2013) You know how sometimes you have one bite of really good chocolate cake, or a really amazing peach, and totally assume that you could eat another 30 lbs of whatever without regard for good manners or physical limitations? Yeah. Decreasing marginal returns dictate that it almost always turns out that […]
Finding myself at 0x is a lot less like starting fresh in a new profession and more like choosing cultural expatriation – it is a whole new (beautiful) world. On my first day everyone spoke what I was relatively sure should be English, but it felt like they were actually speaking in their own dialect […]
Come watch Jan Vitek present Distributed Random Forest at the SF Data Mining group.
In this double header we present a practitioner’s close view of the science and an engineer’s close view of the design and implementation of a distributed algorithm. Day in the Life of a Data Scientist – Chris Pouliot In this session, Netflix analytical leader Chris Pouliot shares his experience building a large team of data scientists at […]
Manhattan loves data + math better than anyone! Join us at our first New York City meetup, talking high-scale algos at Pivotal Labs, Union Sq, NYC. Cliff and I will walk through a Big GLM over large datasets and deep dive into parallelizing and distributing algorithms over distributed array-let data structures.