Saturday, 31 August 2013

What is the best Ruby class design / pattern for this scenario?

I currently have these classes for scraping products from a single retailer
website using Nokogiri. The XPath and CSS selectors for the site are stored in
MySQL:

    require "active_record"
    require "nokogiri"
    # open-uri lets Kernel#open fetch URLs (on Ruby 3.0+ call URI.open instead)
    require "open-uri"

    ActiveRecord::Base.establish_connection(
      :adapter => "mysql2",
      ...
    )

    class Site < ActiveRecord::Base
      has_many :site_details

      # Finds product links on the site's page and stores one SiteDetail per link.
      def create_product_links
        # url is e.g. http://www.example.com
        doc = Nokogiri::HTML(open(url))
        doc.xpath(total_products_path).each do |lnk|
          SiteDetail.find_or_create_by(url: url + "/" + lnk['href'], site_id: id)
        end
      end
    end

    class SiteDetail < ActiveRecord::Base
      belongs_to :site

      # Scrapes one product page using the CSS paths stored on the parent site.
      def get_product_data
        doc = Nokogiri::HTML(open(url))
        title = doc.css(site.title_path).text
        price = doc.css(site.price_path).text
        description = doc.css(site.description_path).text
        update_attributes!(title: title, price: price, description: description)
      end
    end

I will be adding more sites (around 700) in the future. Each site has a
different page structure, so the get_product_data method cannot be used as is.
I may have to use a case or if statement to jump to and execute the relevant
code for each site. With 700 retailers, the class would soon become large and
ugly.
What is the best design approach for this scenario?
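
One possible direction, offered only as a sketch: keep the selector-driven code as a default parser, write a small subclass only for sites whose pages need special handling, and look the class up by a key instead of branching with case. The Parsers module, the ExampleShop class and the "example_shop" key are hypothetical names, not part of the code above. The doc argument is assumed to be anything that responds to css(selector) and returns an object with text, such as a Nokogiri document.

```ruby
# Sketch only: every module, class and key name here is hypothetical.
module Parsers
  # Maps a parser key (e.g. a string stored with each site) to a parser class.
  REGISTRY = {}

  def self.register(key, klass)
    REGISTRY[key] = klass
  end

  # Sites without a custom parser fall back to the selector-driven default.
  def self.for(key)
    REGISTRY.fetch(key, Default)
  end

  # Default behaviour: read each field through the CSS paths stored for the site.
  class Default
    def initialize(paths)
      @paths = paths
    end

    # doc must respond to css(selector) and return an object with #text.
    def parse(doc)
      {
        title: doc.css(@paths[:title_path]).text.strip,
        price: doc.css(@paths[:price_path]).text.strip,
        description: doc.css(@paths[:description_path]).text.strip
      }
    end
  end

  # A site-specific parser overrides only what differs from the default.
  class ExampleShop < Default
    def parse(doc)
      data = super
      data.merge(price: data[:price].delete("$"))
    end
  end

  register("example_shop", ExampleShop)
end
```

With this shape, get_product_data could shrink to a lookup such as Parsers.for(key).new(paths).parse(doc) followed by update_attributes!, assuming a parser key were stored with each site. Most of the 700 sites would then need only rows of selectors, and only the unusual ones would need a class of their own.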
