{"id":19,"date":"2011-09-22T18:34:34","date_gmt":"2011-09-23T02:34:34","guid":{"rendered":"http:\/\/www.dotcomdotat.com\/blog\/?p=19"},"modified":"2011-09-22T18:34:34","modified_gmt":"2011-09-23T02:34:34","slug":"whenever-possible-do-it-once","status":"publish","type":"post","link":"http:\/\/www.dotcomdotat.com\/blog\/2011\/09\/whenever-possible-do-it-once\/","title":{"rendered":"Whenever possible: Do it once"},"content":{"rendered":"<p>We&#8217;ve been trying to deliver a custom report for a client using another system that has been long neglected. Of course we were also trying to do it in a way the original designers did not plan for. In a good state it was taking 3.5 hours to run. In a bad, up to 22. The client had a deadline and was getting anxious. Time to dive under the hood and see what could be done.<\/p>\n<p>I took a few steps to improve things, including moving the process to a new server and stopping unnecessary filters from being applied (i.e. filters that allow all possible cases). The new server obviously helped a great deal, but we were still looking at a 45 minute &#8211; 1 hour process, and it would still take weeks to deliver all the data.<\/p>\n<p>In the end I got it down to 6 minutes. What made the final big difference? Fixing the regexes.<\/p>\n<p>This time it was a bit of an experiment and I really wasn&#8217;t sure it would work. We had a large number of values to filter* for, but as long as one of them appeared in a row of data, that row should be retrieved. Because of the way the original designers implemented the filtering, it was running a separate regex check for each value. It returned as soon as one matched, but in this case that was still an incredible number of regex checks. I rewrote it so all the values are checked in a single regex, joined with OR (alternation). Suddenly the report ran smooth as silk.<\/p>\n<p>It&#8217;s a maxim I always try to observe, and recently I have seen its usefulness time and time again &#8211; whenever possible: do it once. 
Write to disk once, search once, calculate once, and loop as little as possible. If you could do it this way and you aren&#8217;t, then you&#8217;re wasting cycles.<\/p>\n<p>In this particular case the possible issue is having too many alternatives within the single regex, and that was my concern. Luckily, so far it hasn&#8217;t been a problem.<\/p>\n<p>&nbsp;<\/p>\n<p>* I suppose I should mention that our current setup for this particular data does not involve a database. I&#8217;m currently looking into Hadoop and Hive to solve many problems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We&#8217;ve been trying to deliver a custom report for a client using another system that has been long neglected. Of course we were also trying to do it in a way the original designers did not plan for. In a good state it was taking 3.5 hours to run. In a bad, up to 22. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-19","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/posts\/19","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/comments?post=19"}],"version-history":[{"count":1,"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/posts\/19\/revisions"}],"predecessor-version":[{"id":20,"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/posts\/19\/revisions\/20"}],"wp:attachment":[{"href":"http:\/\/www.dotcomdotat.com\/
blog\/wp-json\/wp\/v2\/media?parent=19"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/categories?post=19"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.dotcomdotat.com\/blog\/wp-json\/wp\/v2\/tags?post=19"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}