June 19th, 2013

Convert DOS to Unix – Insert Tab A into Slot B


Every day, as part of my 0x immersion program, one of our hackers tries to explain something he is working on: an especially beautiful bit of code, something about data science, how the mechanics of our project work, or whatever. Every day, at least once, I am completely confused. I realize that this must be exactly how someone who has never taken a statistics class feels when we talk about analysis.

Anyhow, today I spent a shameful amount of time taking the hardest possible path to making sense of the data for a submission to Kaggle. Specifically, before I could even begin to look at the data, I had to tinker with the file. The file has something like 50,000 observations: huge for a social scientist, small for a corporate analyst, and better suited to small-data tools than big ones. I read the file into R, hit enter, and… radio silence. If you upload the same file into H2O, there is zero problem. I totally assumed I was the source of the issue (I still may be).

While H2O will inhale and parse anything, Tom taught me some handy code for converting files that were born in DOS (and for whatever random reason won't work properly on my Mac) to Unix. Working under the assumption that not all 5 of the people who read my blog are code hackers, I'll start with the very basics.
In Terminal, make sure you are in the right directory. The right directory is the one where you have put the file that will parse in H2O but not in R (this may go without saying, but seriously, I totally forget this on a regular basis, and as a result I got to learn the technical term "drop a turd" this evening).
Here's your instruction line: perl -pe 's/\r\n|\n|\r/\n/g' inputfile > outputfiletest. Replace inputfile with the troublesome file you would like to fix, and replace outputfiletest with a name you will recognize for the converted copy. And voila. The caveat: this only fixes line endings (it turns DOS-style ones into Unix-style ones), so if Microsoft isn't the source of your sadness, it probably won't help. Even so, if I find anything else out, I will definitely share.
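
If Perl isn't your thing, here is a rough Python sketch of the same idea. It is not what Tom showed me, just my own approximation of what the one-liner does, and the function names (count_line_endings, normalize_newlines, convert_file) are made up for illustration. The first function is a bonus: it counts which kinds of line endings a file contains, so you can check whether line endings are even the problem before converting.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count DOS (CRLF), lone CR, and lone LF line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one \r and one \n, so subtract the pairs
    # to get the endings that are a \r or a \n on their own.
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,
        "lf": data.count(b"\n") - crlf,
    }


def normalize_newlines(data: bytes) -> bytes:
    """Turn every CRLF, lone CR, or lone LF into a single LF."""
    # \r\n comes first in the pattern so a DOS pair becomes one \n, not two.
    return re.sub(rb"\r\n|\r|\n", b"\n", data)


def convert_file(input_path: str, output_path: str) -> None:
    """Write a copy of input_path with Unix line endings to output_path."""
    # Binary mode keeps Python from translating newlines on its own.
    with open(input_path, "rb") as src:
        data = src.read()
    with open(output_path, "wb") as dst:
        dst.write(normalize_newlines(data))


# Hypothetical usage, with the same placeholder names as the Perl line:
# convert_file("inputfile", "outputfiletest")
```

Like the Perl version, this writes a new file and leaves the original alone, so if the conversion doesn't help you haven't lost anything.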
