Discussion:
Ruby Quiz - Challenge #4 - Turn Humanitarian eXchange Language (HXL) Tabular Records into Named Tuples
Gerald Bauer
2018-11-09 07:19:11 UTC
Hello,

Let's honor the tradition and post a new Ruby Quiz [1] every Friday.
Here we go:

Challenge #4 - Turn Humanitarian eXchange Language (HXL) Tabular
Records into Named Tuples

Let's turn tabular data that uses the Humanitarian eXchange Language (HXL)
hashtag convention from an array of arrays of strings
into an array of named tuples (also known as hashes or dictionaries).


Aside: What's Humanitarian eXchange Language (HXL)?

Humanitarian eXchange Language (HXL) [2]
is a (meta data) convention for
adding agreed-on hashtags, e.g. `#org,#country,#sex+#targeted,#adm1`,
inline in a single new line / row between the last header row and the
first data row, for sharing tabular data across organisations
(during a humanitarian crisis).
Example:

```
What,,,Who,Where,For whom,
Record,Sector/Cluster,Subsector,Organisation,Country,Males,Females,Subregion
,#sector+en,#subsector,#org,#country,#sex+#targeted,#sex+#targeted,#adm1
001,WASH,Subsector 1,Org 1,Country 1,100,100,Region 1
002,Health,Subsector 2,Org 2,Country 2,,,Region 2
003,Education,Subsector 3,Org 3,Country 2,250,300,Region 3
004,WASH,Subsector 4,Org 1,Country 3,80,95,Region 4
```
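
As a side note, here is a minimal sketch of how text like the example above could be read into the array of arrays of strings that the challenge starts from. It assumes Ruby's `csv` library is available; the variable names `hxl_text` and `recs` are made up for illustration and are not part of the quiz.

```ruby
require "csv"

# A shortened copy of the example above (three header rows, one data row).
hxl_text = <<~TXT
  What,,,Who,Where,For whom,
  Record,Sector/Cluster,Subsector,Organisation,Country,Males,Females,Subregion
  ,#sector+en,#subsector,#org,#country,#sex+#targeted,#sex+#targeted,#adm1
  001,WASH,Subsector 1,Org 1,Country 1,100,100,Region 1
TXT

# CSV.parse returns nil for empty unquoted fields;
# to_s turns those into "" so every field is a string.
recs = CSV.parse(hxl_text).map { |row| row.map(&:to_s) }
```

The `to_s` step matters for the bonus level, where untagged columns show up as `""` in the hashtag row.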


The challenge: Code a parse method that passes the RubyQuizTest :-).

```
def parse( recs )
  # ...
end
```

For the starter level 1 turn:

```
parse( [["Organisation", "Cluster", "Province" ],
        [ "#org", "#sector", "#adm1" ],
        [ "Org A", "WASH", "Coastal Province" ],
        [ "Org B", "Health", "Mountain Province" ],
        [ "Org C", "Education", "Coastal Province" ],
        [ "Org A", "WASH", "Plains Province" ]] )
```

into

```
[{"org" => "Org A", "sector" => "WASH", "adm1" => "Coastal Province"},
{"org" => "Org B", "sector" => "Health", "adm1" => "Mountain Province"},
{"org" => "Org C", "sector" => "Education", "adm1" => "Coastal Province"},
{"org" => "Org A", "sector" => "WASH", "adm1" => "Plains Province"}]
```
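
One possible shape for level 1, as a rough sketch only: it assumes exactly one header row followed by the hashtag row, and the name `parse_level1` is made up here so it is not confused with the `parse` method the test expects.

```ruby
# Sketch for level 1 only: row 0 is the header, row 1 holds the hashtags,
# everything after that is data.
def parse_level1(recs)
  _header, tags, *rows = recs
  keys = tags.map { |tag| tag.delete_prefix("#") }  # "#org" => "org"
  rows.map { |row| keys.zip(row).to_h }             # pair keys with values
end
```

`delete_prefix` needs Ruby 2.5 or later.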


Bonus: For a greater level 2 challenge with three extra rules:

- Skip / ignore extra header rows (e.g. one or more rows before
hashtag line / row)
- Skip / ignore "untagged" fields / columns (e.g. `""`) in the hashtag
line / row in the named tuple hash dictionary
- Fold repeat (duplicate) fields / columns (e.g. `#sex+#targeted`)
into a list / array


Turn:

```
parse( [["What","","","Who","Where","For whom",""],
["Record","Sector/Cluster","Subsector","Organisation","Country","Males","Females","Subregion"],
["","#sector+en","#subsector","#org","#country","#sex+#targeted","#sex+#targeted","#adm1"],
["001","WASH","Subsector 1","Org 1","Country 1","100","100","Region 1"],
["002","Health","Subsector 2","Org 2","Country 2","","","Region 2"],
["003","Education","Subsector 3","Org 3","Country 2","250,300","Region 3"],
["004","WASH","Subsector 4","Org 1","Country 3","80","95","Region 4"]] )
```

into

```
[{"sector+en" => "WASH",
"subsector" => "Subsector 1",
"org" => "Org 1",
"country" => "Country 1",
"sex+targeted" => ["100", "100"],
"adm1" => "Region 1"},
{"sector+en" => "Health",
"subsector" => "Subsector 2",
"org" => "Org 2",
"country" => "Country 2",
"sex+targeted" => ["", ""],
"adm1" => "Region 2"},
{"sector+en" => "Education",
"subsector" => "Subsector 3",
"org" => "Org 3",
"country" => "Country 2",
"sex+targeted" => ["250", "300"],
"adm1" => "Region 3"},
{"sector+en" => "WASH",
"subsector" => "Subsector 4",
"org" => "Org 1",
"country" => "Country 3",
"sex+targeted" => ["80", "95"],
"adm1" => "Region 4"}]
```
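
The three bonus rules could be sketched roughly like this. It is one possible approach, not the reference solution: the name `parse_level2` is made up, all fields are assumed to be strings, and the hashtag row is assumed to be the first row with a field starting with `#`.

```ruby
def parse_level2(recs)
  # Rule 1: skip extra header rows by locating the hashtag row.
  tag_index = recs.index { |rec| rec.any? { |field| field.start_with?("#") } }
  return [] if tag_index.nil?

  # "#sex+#targeted" => "sex+targeted"; untagged columns stay "".
  keys = recs[tag_index].map { |tag| tag.delete("#") }

  recs[(tag_index + 1)..-1].map do |rec|
    keys.zip(rec).each_with_object({}) do |(key, value), tuple|
      next if key.empty?              # Rule 2: ignore untagged columns.
      if keys.count(key) > 1          # Rule 3: fold repeated tags into an array.
        (tuple[key] ||= []) << value
      else
        tuple[key] = value
      end
    end
  end
end
```

Checking `keys.count(key)` up front means a repeated tag always yields an array, even when its values are empty strings.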


Start from scratch or, yes, use any library / gem you can find.

To qualify for solving the code challenge / puzzle you must pass the test [3].

Post your code snippets on the "official" Ruby Quiz Channel,
that is, the ruby-talk mailing list (right here).

Comments and discussion more than welcome.

Happy hacking and data wrangling with Ruby.



[1] https://github.com/planetruby/quiz
[2] https://github.com/csvspecs/csv-hxl
[3] https://github.com/planetruby/quiz/blob/master/004/test.rb

Unsubscribe: <mailto:ruby-talk-***@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk>
Frank J. Cameron
2018-11-11 02:13:58 UTC
Post by Gerald Bauer
Challenge #4 - Turn Humanitarian eXchange Language (HXL) Tabular
Records into Named Tuples
$ ruby ./lib/004.rb
Run options: --seed 28092
# Running:
.F
Finished in 0.005780s, 346.0336 runs/s, 346.0336 assertions/s.
1) Failure:
RubyQuizTest#test_level2 [/home/fjc/ruby-quiz/004/test.rb:71]:
--- expected
+++ actual
@@ -1 +1 @@
-[{"sector+en"=>"WASH", "subsector"=>"Subsector 1", "org"=>"Org 1",
"country"=>"Country 1", "sex+targeted"=>["100", "100"], "adm1"=>"Region
1"}, {"sector+en"=>"Health", "subsector"=>"Subsector 2", "org"=>"Org 2",
"country"=>"Country 2", "sex+targeted"=>["", ""], "adm1"=>"Region 2"},
{"sector+en"=>"Education", "subsector"=>"Subsector 3", "org"=>"Org 3",
"country"=>"Country 2", "sex+targeted"=>["250", "300"], "adm1"=>"Region
3"}, {"sector+en"=>"WASH", "subsector"=>"Subsector 4", "org"=>"Org 1",
"country"=>"Country 3", "sex+targeted"=>["80", "95"], "adm1"=>"Region 4"}]
+[{"sector+en"=>"WASH", "subsector"=>"Subsector 1", "org"=>"Org 1",
"country"=>"Country 1", "sex+targeted"=>["100", "100"], "adm1"=>"Region
1"}, {"sector+en"=>"Health", "subsector"=>"Subsector 2", "org"=>"Org 2",
"country"=>"Country 2", "sex+targeted"=>["", ""], "adm1"=>"Region 2"},
{"sector+en"=>"Education", "subsector"=>"Subsector 3", "org"=>"Org 3",
"country"=>"Country 2", "sex+targeted"=>["250,300", "Region 3"],
"adm1"=>nil}, {"sector+en"=>"WASH", "subsector"=>"Subsector 4",
"org"=>"Org 1", "country"=>"Country 3", "sex+targeted"=>["80", "95"],
"adm1"=>"Region 4"}]
2 runs, 2 assertions, 1 failures, 0 errors, 0 skips

$ cat ./lib/004.rb
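require_relative '../004/test.rb'
class RubyQuizTest
  def parse(recs)
    # Find the index of the first row that contains a hashtag field.
    header_index = recs.find.with_index do |rec, i|
      break i if rec.any?{|r| r[0] == ?# }
    end
    # Strip the "#" characters in place; delete! returns nil when nothing
    # was deleted, so untagged fields ("") become nil here.
    header_rec = recs[header_index].map do |r|
      r.delete!(?#)
    end
    recs[header_index+1..-1].map do |rec|
      {}.tap do |rec_hash|
        header_rec.zip(rec) do |(h,r)|
          next unless h  # skip untagged columns
          if rec_hash.has_key?(h)
            # Repeated tag: fold the values into an array.
            if rec_hash[h].instance_of?(Array)
              rec_hash[h] << r
            else
              rec_hash[h] = [rec_hash[h]] << r
            end
          else
            rec_hash[h] = r
          end
        end
      end
    end
  end
end
RubyQuizTest.new('fjc')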



Gerald Bauer
2018-11-11 08:35:26 UTC
Hello,
Thanks for the first HXL parser solution (even with a handicap).
Sorry for the typo (fixed now) - changed "250,300" to "250","300" in
the expected records result in level 2.
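
For anyone wondering why the unsplit "250,300" field produced `"adm1"=>nil` in the failure output above, here is a small illustration. The shortened `tags` and row arrays are made up for the example; the point is only that `Array#zip` pads a shorter row with `nil`.

```ruby
tags      = ["country", "sex+targeted", "sex+targeted", "adm1"]
typo_row  = ["Country 2", "250,300", "Region 3"]     # one field short
fixed_row = ["Country 2", "250", "300", "Region 3"]

# With the typo, every value after "250,300" shifts one column to the left
# and the last tag is paired with nil.
typo_pairs  = tags.zip(typo_row)
fixed_pairs = tags.zip(fixed_row)

p typo_pairs
p fixed_pairs
```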

I've added your snippet to the solution.rb [1] script:

Finished in 0.007808s, 256.1529 runs/s, 256.1529 assertions/s.
2 runs, 2 assertions, 0 failures, 0 errors, 0 skips

Have a great weekend. Cheers. Prost.

[1] https://github.com/planetruby/quiz/blob/master/004/solution.rb
