Sunday, October 16, 2011

The best programming language in regard of library availability, ease to find/install a new library and community support

I found a question on stackoverflow about
What is a comparison of famous programming languages in regard of those aspects?
And I thought answering the more specific "Which is the best programming language in regard of library availability, ease to find/install a new library and community support?".

The answer is quire obvious:

Perl is hegemonic


  • it has a single point of access to ALL its libraries - CPAN
  • its description is more than compelling saying that The Comprehensive Perl Archive Network (CPAN) currently has 100,462 Perl modules in 23,551 distributions, written by 9,266 authors, mirrored on 269 servers. The archive has been online since October 1995 and is constantly growing.
  • finding modules requires just a search on http://search.cpan.org/
cpan Module::Name

 at the command line being sufficient
  • the community is simply awesome, the fastest and most efficient way to get help is to go on irc on the perl irc channels and ask there.
  • I have also seen some of the world perl gurus activating on stackoverflow
  • beside all of these, perl has cpan testers which provides multi-platform testing for modules. Today there are over 15 million tester reports and more than 100 testers each month giving valuable feedback for users and authors alike.

Regarding other programming languages, I think that Ruby is the one closest to copying the CPAN model

Friday, October 14, 2011

MySQL `FORCE INDEX` best practices?

Hi,

Almost everywhere I read the use of FORCE INDEX is highly discouraged and I perfectly understand and know why - there are immensely huge chances that MySQL knows better what indexes to choose than the (average) developer.

However, recently I have found a case where FORCE INDEX improved my execution times in the range of hundred of times:


  •  JOIN on 4 tables
  •  first table has about 500 000 records
  •  2 of the INNER JOINed table have more than 1 milion records
  •  first table has a field called published_date, stored as varchar in YMD format (could not changed to datetime)
  •  needed a range on published_date of at most 5 000 records
  •  some other GROUP BY and ORDER BY clauses on the first table on different fields than published_date were needed for this query


Although I've rewritten the query in many ways, I was not able to get execution times smaller than 130 seconds (with highest being over 700 ). After using FORCE INDEX with published_date, the execution time dropped below 5 seconds.

It took me a few days to remember about the infamous FORCE INDEX option.

Based on suggestions I received on StackOverflow, I have issued ANALIZE TABLE and OPTIMIZE TABLE on the annoing table that does not want to use the index - unfortunately, it stiil doesn't use that index, so for now FORCE INDEX remains the only viable option. Below is the output for EXPLAIN after each step I took in my tryings to optimize the table:

 without FORCE INDEX on table with alias a  
 id     select_type     table     type     possible_keys     key     key_len     ref     rows     Extra  
 1     SIMPLE     am2     range     PRIMARY,idx_meta_article     idx_meta_article     4     NULL     275228     Using where; Using index; Using temporary; Using f...  
 1     SIMPLE     a     eq_ref     PRIMARY,serial_issue_date_productid,pub_date,idx_d...     PRIMARY     4     mydb_toto.am2.ArticleID     1     Using where  
 1     SIMPLE     ai     ref     PRIMARY,idx_iso_article     PRIMARY     4     mydb_toto.a.serial     11523     Using where; Using index  
 1     SIMPLE     m     range     PRIMARY,meta_articles_type     meta_articles_type     4     NULL     96     Using where  
 1     SIMPLE     am     eq_ref     PRIMARY,idx_meta_article     PRIMARY     8     mydb_toto.a.serial,mydb_toto.m.meta_id     1     Using where; Using index  
 with FORCE INDEX on table with alias a  
 id     select_type     table     type     possible_keys     key     key_len     ref     rows     Extra  
 1     SIMPLE     a     range     pub_date     pub_date     11     NULL     17679     Using where; Using temporary; Using filesort  
 1     SIMPLE     am2     ref     PRIMARY,idx_meta_article     PRIMARY     4     mydb_toto.a.serial     21930     Using where; Using index  
 1     SIMPLE     ai     ref     PRIMARY,idx_iso_article     PRIMARY     4     mydb_toto.a.serial     11523     Using where; Using index  
 1     SIMPLE     m     range     PRIMARY,meta_articles_type     meta_articles_type     4     NULL     96     Using where  
 1     SIMPLE     am     eq_ref     PRIMARY,idx_meta_article     PRIMARY     8     mydb_toto.am2.ArticleID,mydb_toto.m.meta_id     1     Using where; Using index  
 after ANALYZE TABLE, without FORCE INDEX:  
 id     select_type     table     type     possible_keys     key     key_len     ref     rows     Extra  
 1     SIMPLE     am2     range     PRIMARY,idx_meta_article     idx_meta_article     4     NULL     275228     Using where; Using index; Using temporary; Using f...  
 1     SIMPLE     a     eq_ref     PRIMARY,serial_issue_date_productid,pub_date,idx_d...     PRIMARY     4     mydb_toto.am2.ArticleID     1     Using where  
 1     SIMPLE     ai     ref     PRIMARY,idx_iso_article     PRIMARY     4     mydb_toto.a.serial     11523     Using where; Using index  
 1     SIMPLE     m     range     PRIMARY,meta_articles_type     meta_articles_type     4     NULL     96     Using where  
 1     SIMPLE     am     eq_ref     PRIMARY,idx_meta_article     PRIMARY     8     mydb_toto.a.serial,mydb_toto.m.meta_id     1     Using where; Using index  
 after OPTIMIZE TABLE, without FORCE INDEX:  
 id     select_type     table     type     possible_keys     key     key_len     ref     rows     Extra  
 1     SIMPLE     am2     range     PRIMARY,idx_meta_article     idx_meta_article     4     NULL     275228     Using where; Using index; Using temporary; Using f...  
 1     SIMPLE     a     eq_ref     PRIMARY,serial_issue_date_productid,pub_date,idx_d...     PRIMARY     4     mydb_toto.am2.ArticleID     1     Using where  
 1     SIMPLE     ai     ref     PRIMARY,idx_iso_article     PRIMARY     4     mydb_toto.a.serial     11523     Using where; Using index  
 1     SIMPLE     m     range     PRIMARY,meta_articles_type     meta_articles_type     4     NULL     96     Using where  
 1     SIMPLE     am     eq_ref     PRIMARY,idx_meta_article     PRIMARY     8     mydb_toto.a.serial,mydb_toto.m.meta_id     1     Using where; Using index  
 after OPTIMIZE TABLE and ANALYZE TABLE, with FORCE INDEX  
 id     select_type     table     type     possible_keys     key     key_len     ref     rows     Extra  
 1     SIMPLE     a     range     pub_date     pub_date     11     NULL     17679     Using where; Using temporary; Using filesort  
 1     SIMPLE     am2     ref     PRIMARY,idx_meta_article     PRIMARY     4     mydb_toto.a.serial     21930     Using where; Using index  
 1     SIMPLE     ai     ref     PRIMARY,idx_iso_article     PRIMARY     4     mydb_toto.a.serial     11523     Using where; Using index  
 1     SIMPLE     m     range     PRIMARY,meta_articles_type     meta_articles_type     4     NULL     96     Using where  
 1     SIMPLE     am     eq_ref     PRIMARY,idx_meta_article     PRIMARY     8     mydb_toto.am2.ArticleID,mydb_toto.m.meta_id     1     Using where; Using index  

Questions:


  •  What are other use cases you found where FORCE INDEX saved you?
  • Do you have some best practices when you consider using FORCE INDEX?