Any project involving data requires that data to be in a specific format. Visualization libraries such as ggplot2 or matplotlib work with specific types of data. Any modeling or prediction is going to require a specific format. Most of the time a project requires several iterations of plotting or analysis, so data munging is a skill that you’ll use a LOT.
Cleaning URL Data in PowerShell
As an analyst working in the marketing world, I find that most of the data I see includes URLs. In fact, much of the data is focused on URLs. I tend to scrub URLs a lot, and have collected a handful of scripts to help me. Hopefully they will help you as well. I’ve included references where appropriate.