Thursday, October 13, 2011

Internationalize a seed file


A seed file is a file that contains data your app needs to function. It's a better way to manage that data than keeping "canonical" copies of databases hanging around. I took internationalizing this seed file as an opportunity to flex some command-line muscle. Starting with this file...


Foo.fast_bootstrap(
  {id:2,  key: "General",                     position: 1},
  {id:15, key: "Endocrine",                   position: 2},
  {id:12, key: "Skin",                        position: 3},
  {id:14, key: "Musculoskeletal",             position: 4},
  {id:6,  key: "Neurological",                position: 5},
  {id:16, key: "Psychological",               position: 6},
  {id:7,  key: "Eyes",                        position: 7},
  {id:8,  key: "Ears",                        position: 8},
  {id:9,  key: "Mouth, Nose & Throat",        position: 9},
  {id:13, key: "Breast/Chest",                position: 10},
  {id:3,  key: "Lungs & Breathing",           position: 11},
  {id:4,  key: "Heart, Blood & Circulation",  position: 12},
  {id:5,  key: "Digestive/Gastrointestinal",  position: 13},
  {id:17, key: "Urinary",                     position: 14},
  {id:18, key: "Men",                         position: 16},
  {id:19, key: "Women",                       position: 17},
  'id'
)

...and using this command...

cat db/seeds/foo.rb | cut -d '"' -f 2 | head -17 | tail -16 | ruby -e "s = *ARGF; s.each { |s| puts s.downcase.chomp.gsub('&', '').gsub('/', ' ').gsub(',', '').squeeze(' ').gsub(' ', '_') + ': ' + '\"' + s.chomp + '\"'}"

...we end up with something that's ready to paste into a YML file!


general: "General"
endocrine: "Endocrine"
skin: "Skin"
musculoskeletal: "Musculoskeletal"
neurological: "Neurological"
psychological: "Psychological"
eyes: "Eyes"
ears: "Ears"
mouth_nose_throat: "Mouth, Nose & Throat"
breast_chest: "Breast/Chest"
lungs_breathing: "Lungs & Breathing"
heart_blood_circulation: "Heart, Blood & Circulation"
digestive_gastrointestinal: "Digestive/Gastrointestinal"
urinary: "Urinary"
men: "Men"
women: "Women"

Here's the breakdown of what the command does:

cut -d '"' -f 2

Split each line at every " character, then give me the second field, which is the text between the first pair of quotes. Lines that contain no " at all are passed through whole.

head -17 | tail -16

Give me the first 17 lines, then the last 16 of those. This trims the first line and the last two lines from our text. TODO: figure out how to do this dynamically instead of counting lines.
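One possible answer to that TODO, as a rough sketch: instead of counting lines, keep only the lines that actually contain a double-quoted string. The helper name quoted_labels and the shortened sample below are made up for illustration; they are not part of the original seed file or command.

```ruby
# Hypothetical helper: return the double-quoted text from every line that has one.
# Lines without double quotes (the opening call, 'id', and the closing paren)
# yield nil and are dropped by compact, so no head/tail line counting is needed.
def quoted_labels(lines)
  lines.map { |line| line[/"(.*)"/, 1] }.compact
end

# Shortened stand-in for db/seeds/foo.rb
sample = <<~SEED
  Foo.fast_bootstrap(
    {id:2,  key: "General",                     position: 1},
    {id:9,  key: "Mouth, Nose & Throat",        position: 9},
    'id'
  )
SEED

puts quoted_labels(sample.lines)
```

This assumes each data line holds exactly one double-quoted string, which is true of the seed file above but would need revisiting for anything messier.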

ruby -e...

This allows us to drop into Ruby to do the rest of our string manipulation. The -e tells Ruby to run the string that follows as a script instead of reading the script from a file.

s = *ARGF;

ARGF is the stream of data getting piped in. The splat (*) turns it into an array of strings, one per line, each still ending in its \n character.
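To see what the splat does without piping anything in, here is a small sketch that uses a StringIO as a stand-in for ARGF. That substitution is an assumption made for illustration: both objects are enumerable over lines, so splatting either one produces an array of lines.

```ruby
require 'stringio'

# StringIO stands in for ARGF here (an assumption made for this demo).
input = StringIO.new("General\nSkin\n")

# The splat collects every line into an array, newlines included.
lines = *input

p lines
```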

s.each { |s| puts s.downcase.chomp.gsub('&', '').gsub('/', ' ').gsub(',', '').squeeze(' ').gsub(' ', '_') + ': ' + '\"' + s.chomp + '\"'}

Now we manipulate the strings and output them so we can copy-paste from the terminal. TODO: Extract everything before the first + into a "yml_keyify" method. We downcase the string, remove the trailing \n character, get rid of any ampersands and commas, turn slashes into spaces, reduce any instances of multiple spaces into one space, and turn our spaces into underscores. Whew! After that, we just add the colon and output the original string (surrounded by ") as the value of our YML key-value pair.
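Here is a sketch of what that extraction might look like. The method name yml_keyify comes from the TODO; the body is just the chain from the one-liner, moved into a method.

```ruby
# Sketch of the "yml_keyify" method from the TODO: turn a display label
# into a YAML-friendly key. Steps, in order: downcase, drop the trailing
# newline, remove ampersands, turn slashes into spaces, remove commas,
# collapse runs of spaces, and turn spaces into underscores.
def yml_keyify(label)
  label.downcase
       .chomp
       .gsub('&', '')
       .gsub('/', ' ')
       .gsub(',', '')
       .squeeze(' ')
       .gsub(' ', '_')
end

["Mouth, Nose & Throat\n", "Breast/Chest\n"].each do |label|
  puts %(#{yml_keyify(label)}: "#{label.chomp}")
end
```

With the method in place, the body of the each block shrinks to a single readable line, and the same method can be reused when rewriting the seed file.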


Now to replace the bad keys within the seed file, we use:


cat db/seeds/foo.rb | ruby -e "s = *ARGF; s.each { |s| in_quotes = s.match(/\"{1}(.*)\"{1}/); next unless in_quotes; bad_key = in_quotes[1]; puts s.gsub(bad_key, bad_key.downcase.chomp.gsub('&', '').gsub('/', ' ').gsub(',', '').squeeze(' ').gsub(' ', '_')) }"
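That one-liner is dense, so here is an unpacked sketch of the same idea. The names keyify and rewrite_line are hypothetical, and the sample is a shortened stand-in for the seed file. One deliberate difference from the one-liner: lines without a double-quoted string are passed through unchanged rather than skipped.

```ruby
# Hypothetical helper: same string chain as the one-liner, minus the chomp,
# which is not needed because the captured label never contains a newline.
def keyify(label)
  label.downcase.gsub('&', '').gsub('/', ' ').gsub(',', '').squeeze(' ').gsub(' ', '_')
end

# Replace the double-quoted label on a line with its key form.
# Lines without a double-quoted string are returned unchanged.
def rewrite_line(line)
  line.sub(/"(.*)"/) { %("#{keyify($1)}") }
end

# Shortened stand-in for db/seeds/foo.rb
sample = <<~SEED
  Foo.fast_bootstrap(
    {id:13, key: "Breast/Chest",                position: 10},
    'id'
  )
SEED

puts sample.lines.map { |line| rewrite_line(line) }
```

This sketch does not realign the position column, so the whitespace fixing is still a manual step.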


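The command only prints the lines that contain a double-quoted string, so the surrounding lines have to be put back by hand. A little copy-paste and whitespace fixing later, our new seed file looks like: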



Foo.fast_bootstrap(
  {id:2,  key: "general",                     position: 1},
  {id:15, key: "endocrine",                   position: 2},
  {id:12, key: "skin",                        position: 3},
  {id:14, key: "musculoskeletal",             position: 4},
  {id:6,  key: "neurological",                position: 5},
  {id:16, key: "psychological",               position: 6},
  {id:7,  key: "eyes",                        position: 7},
  {id:8,  key: "ears",                        position: 8},
  {id:9,  key: "mouth_nose_throat",           position: 9},
  {id:13, key: "breast_chest",                position: 10},
  {id:3,  key: "lungs_breathing",             position: 11},
  {id:4,  key: "heart_blood_circulation",     position: 12},
  {id:5,  key: "digestive_gastrointestinal",  position: 13},
  {id:17, key: "urinary",                     position: 14},
  {id:18, key: "men",                         position: 16},
  {id:19, key: "women",                       position: 17},
  'id'
)


Much better, and ready for localization!


For a second, more complex version of the same program, I wrote this script: https://gist.github.com/1285825
