Tutorial big data analysis: Weather changes in the Carpathian-Basin from 1900 to 2014 – Part 6/9
Manipulating output data with the Linux SED command – SED example
The Kartograph tutorial uses JSON as its data format, so I needed my own data in the same format. Example:
[{"Weather": "ARAD RO", "ll": [21.35, 46.1331], "1882": 742.0 ... {"Weather": "MURSKA SOBOTA RAKICAN SI", "ll": [16.2, 46.7], ... "1962": 931.0}]
The resulting data from Pig is not compatible with it, as it has different markup, presented here:
((ARAD RO,46.1331,21.35),{((1882,742)),((1883,680)),((1884,656)),((1885,656)),((1886,770)),((1887,718)),((1888,467)),((1889,893)),((1890,570)), ... 92))}) ((DEVA RO,45.8667,22.9),...
So I've used the Linux sed command to alter the resulting dataset from Pig.
SED example
Sed command 1
sed 's/((\([A-Z ]*\),\([0-9.]*\),\([0-9.]*\))/[{"Weather": "\1", "ll": [\3, \2], /g' rain_orig.csv > rain_new.csv && cat rain_new.csv
This command looks for "((" followed by any number of capital letters or spaces, then a "," and two blocks made of digits or "." that are separated by a ",", and finally a closing ")". So it matches
((ARAD RO,46.1331,21.35)
and changes this string to
[{"Weather": "ARAD RO", "ll": [21.35, 46.1331],
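The syntax:
sed 's/regular expression/replacement/g' input > output
For each line of the input, sed finds every match of the regular expression and changes it to the replacement. The trailing g makes it replace all matches on a line instead of only the first one.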
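To make the role of the trailing g concrete, here is a minimal sketch run on a short fragment of the Pig output. The variable names line, first_only and every_match are only illustrative and do not come from the original workflow.

```shell
# Fragment in the style of the Pig output shown earlier.
line='((1882,742)),((1883,680))'

# Without g, only the first "((" on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/((/[/')

# With g, every "((" on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/((/[/g')

printf '%s\n' "$first_only" "$every_match"
```

The first result is [1882,742)),((1883,680)) and the second is [1882,742)),[1883,680)), so without g the later records on the same line would stay untouched.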
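\1, \2 and \3 in sed command 1 are back-references to the parts of the regular expression that are enclosed in \( and \) marks. In our case \1 refers to ARAD RO, while \2 and \3 refer to the two coordinates, which the replacement writes in swapped order.
The command reads rain_orig.csv, the output of Pig, and writes the result to rain_new.csv.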
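The back-references can be tried on a single record header before running the command on the whole file. This is a small sketch that assumes the capture groups are written as \( and \) in basic regular expression syntax; the variable names sample and header are illustrative.

```shell
# Record header copied from the Pig output shown earlier.
sample='((ARAD RO,46.1331,21.35)'

# \( ... \) marks a capture group; \1, \2, \3 in the replacement
# insert whatever the first, second and third group matched.
header=$(printf '%s\n' "$sample" |
  sed 's/((\([A-Z ]*\),\([0-9.]*\),\([0-9.]*\))/[{"Weather": "\1", "ll": [\3, \2], /g')

printf '%s\n' "$header"
```

Testing on one line like this makes it easy to see which group ends up where before the full dataset is touched.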
Sed command 2
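sed 's/,*{((\([0-9]*\),/"\1": /g' rain_new.csv > rain_new2.csv && cat rain_new2.csv
The following commands are built the same way as the one above. This one looks for {((number, together with the comma that command 1 leaves in front of it, and changes it to "number": so that the first year of each station becomes a JSON key.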
Sed command 3
sed 's/)),((\([0-9]*\),/, "\1": /g' rain_new2.csv > rain_new3.csv && cat rain_new3.csv
This one looks for )),((number, and changes it to , "number": which handles all the remaining years.
After running all three sed commands we have the results in the needed format:
[{"Weather": "ARAD RO", "ll": [21.35, 46.1331], "1882": 742.0, "1883": 680.0, ...
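The three substitutions can also be chained in one sed call with -e, which avoids the intermediate files. The sketch below runs them on a shortened sample record (only three yearly values are kept). It assumes the capture groups are written as \( and \), and the second expression additionally consumes the comma in front of {, which is an assumption about the intended output. The variable names record and converted are illustrative.

```shell
# Shortened sample record in the Pig output format shown earlier.
record='((ARAD RO,46.1331,21.35),{((1882,742)),((1883,680)),((1884,656))})'

# Expression 1: station header -> "Weather" and "ll" fields.
# Expression 2: first year of the record -> JSON key.
# Expression 3: every following year -> JSON key.
converted=$(printf '%s\n' "$record" | sed \
  -e 's/((\([A-Z ]*\),\([0-9.]*\),\([0-9.]*\))/[{"Weather": "\1", "ll": [\3, \2], /g' \
  -e 's/,*{((\([0-9]*\),/"\1": /g' \
  -e 's/)),((\([0-9]*\),/, "\1": /g')

printf '%s\n' "$converted"
```

On this sample the result is [{"Weather": "ARAD RO", "ll": [21.35, 46.1331], "1882": 742, "1883": 680, "1884": 656))}). Note that the trailing ))}) of the record is not touched by these three substitutions and still has to be cleaned up separately.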
One manual task is left: properly closing the encapsulation at the end of the JSON file by changing the final }, to }]
The JSON for the map is now available, and I've saved it to the directory of the new map visualization example.
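If you prefer not to edit the file by hand, this last step could probably be scripted as well. The sketch below is an assumption, not part of the original workflow: it uses the sed address $ (last line only) and expects the file to end with }, as described above. The two sample lines are hypothetical stand-ins for the real data.

```shell
# Hypothetical two-record file body; only the ending matters here.
closed=$(printf '%s\n' '[{"Weather": "A"},' '{"Weather": "B"},' |
  # "$" restricts the substitution to the last line of the input,
  # where a trailing "}," (plus optional whitespace) becomes "}]".
  sed '$ s/},[[:space:]]*$/}]/')

printf '%s\n' "$closed"
```

The first line keeps its trailing comma, and only the last line is changed so that it ends in }]. On the real data the same expression would be applied to the output of the third sed command.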