March
16

Who loves data? ... I can't hear you! Who loves data? Everyone loves data, and the Government has lots of it. In this case, I'm referring to the Government of Canada. Celebrating its one-year anniversary, the Open Data Pilot Project is, well, one year old now. I would like to thank David Eaves for promoting it via his blog (Sharing ideas about data.gc.ca; http://eaves.ca/2012/03/15/sharing-ideas-about-data-gc-ca).

This is a very large topic, too large to cover in a single blog post. Instead, I'll share what I did yesterday and today while playing with the dataset named "Data.gc.ca Portal Catalogue".

I'm sure it will be fixed soon enough, but the file itself seemed to have character encoding problems. It is in the "latin1" character set, but has Unicode characters crammed into it. It cost me hours of grief over two days to diagnose the problem and fix the file into something that displayed mostly correctly. If you are a MySQL user (or use any other SQL database), this basic UTF-8 SQL file should save you the time of trying to import and convert it. data_gc_ca-all_datasets.zip (830KB zip file, 11.5MB expanded)
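For anyone who wants to attempt the same kind of repair on their own copy, here is a minimal sketch of one common fix, assuming the damage is UTF-8 text that was decoded as latin1 somewhere along the way. That is my guess at the cause, not a confirmed diagnosis of the catalogue file, and the function name is made up for illustration. MySQL's "latin1" is really closer to Windows-1252, so "cp1252" may be the codec to try if "latin-1" does not round-trip.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mistakenly decoded as latin1.

    Assumption: the whole value was double-encoded. If it was not,
    the value is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Already correct, or damaged in some other way: leave it alone.
        return text


if __name__ == "__main__":
    # "à" stored as UTF-8 (C3 A0) and read back as latin1 shows up as "Ã" + NBSP.
    print(repair_mojibake("coupe Ã\xa0 blanc"))
```

Values that are already clean fall through the exception handler untouched, so the function can be run over every text column without first working out which rows are affected.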

SQL format was only an intermediary for my plans. As a test, I wanted to load this simple dataset into Solr for searching. You can try the result here:

Government of Canada Open Data Solr Search* v0.1

http://www.grahamnott.com/data_gc_ca/

This does little more than the current Open Data search so far. In order of complexity and infeasibility, the improvements could include:

  • Search filtering by category, date, etc.
    • It would be similar to filtering auctions on eBay, by clicking links on the left side to refine your search
  • Advanced search
    • You could search only the French Description field, for example
  • Modifications to relevance searching (by popularity), spellchecking, etc.
    • Popular datasets would appear at the top of the search results, for example
  • Full text search over the entire contents of every dataset
    • You can search "Silvicultural", for example, and find dataset 645C42BA-5DC3-412E-AA3C-E37995354CB8, but searching "Clearcutting" (or "coupe à blanc") would also find this dataset, if the .csv file for the dataset was keyword indexed too
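To make the first two ideas a little more concrete, here is a rough sketch of how a field-specific search with a filter and facet counts could be expressed as a Solr request. The parameters q, fq, facet, facet.field and wt are standard Solr query parameters; the server URL and the field names description_fr and category are hypothetical and are not taken from the actual index.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint, used only for illustration.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(terms, field=None, filters=None, facet_fields=()):
    """Build a Solr select URL for a keyword search.

    field        -- restrict the search to one field (advanced search)
    filters      -- {field: value} pairs sent as fq filter queries
    facet_fields -- fields to return counts for (the eBay-style links)
    """
    query = f"{field}:({terms})" if field else terms
    params = [("q", query), ("wt", "json")]
    for name, value in (filters or {}).items():
        params.append(("fq", f'{name}:"{value}"'))
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("coupe", field="description_fr",
                           filters={"category": "Forestry"},
                           facet_fields=["category"]))
```

Filters go in fq rather than being folded into q so that they narrow the result set without changing the relevance scores.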

Filtering by category or date is not possible yet, since the fields do not exist. It would be nice to have a "Date Added" field to indicate when the dataset first appeared on Data.gc.ca, so that newly added datasets could be noticed easily. That would be in addition to adding "Subject Area", "Creator", and many other fields.

Other issues and questions arise when working with this data, which I'll save for another time.

If you try out the MySQL table or Solr search, please leave comments with your feedback, questions or errors.

February
6

Today I am testing the CiviCRM 4.1 beta 3 release on WordPress. On WordPress, you may ask? If you are already a CiviCRM user, you may be as surprised as I was to see it becoming available as a WordPress plugin. Until recently, you could only run it on Drupal or Joomla. You can read the CiviCRM blog for more information about CiviCRM 4.1 beta 3.

I'm running on Windows 7 right now for kicking the tires, as it were. I had to make a couple of small changes to get it going, so I am sharing them below:

1. Writable directories and files

On the WordPress Installation Guide for CiviCRM 4.1, you'll see the error mentioned "The user account used by your web-server needs to be granted write access to the following directory..." and a link to the CiviCRM forums. The suggested fix to change the read-only flag on the directories and files did not work for me, using the GUI or command line on Windows 7 (as administrator or not as administrator).

The workaround that allowed me to continue was to comment out the check for writable directories in /plugins/civicrm/civicrm/install/index.php, lines 368 to 370:

        foreach ( $writableDirectories as $dir ) {
            /*$this->requireWriteable( CIVICRM_DIRECTORY_SEPARATOR . $dir,
                array("File permissions", "Is the $dir folder writeable?", null ),
                true );*/
        }

2. Fix an HTTP Error 500 Internal Server Error

I clicked the CiviCRM install button and everything seemed to be running. But when I clicked "CiviCRM" on the sidebar, I got an HTTP 500 error. Uh oh. Nothing specific was mentioned in the web server logs as to why. After some trial and error, I fixed it by editing the $civicrm_root and CIVICRM_TEMPLATE_COMPILEDIR values in the civicrm.settings.php file. They had a mixture of forward slashes, backslashes and double backslashes in the paths. Once I changed them all to single forward slashes, I could reach CiviCRM. My example below:

global $civicrm_root;

$civicrm_root = 'D:/www/wordpress/public_html/wp-content/plugins/civicrm/civicrm/';
define( 'CIVICRM_TEMPLATE_COMPILEDIR',
'D:/www/wordpress/public_html/wp-content/plugins/files/civicrm/templates_c/' );
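I made the change by hand, but the same clean-up can be sketched in a few lines. This is only an illustration of the idea, written in Python rather than in the PHP settings file; the function name and the sample path are hypothetical. It assumes plain local drive paths, so do not run it over UNC paths (\\server\share) or URLs, where doubled slashes are meaningful.

```python
import re


def normalize_path(path: str) -> str:
    """Replace every run of backslashes and/or forward slashes with one '/'."""
    return re.sub(r"[\\/]+", "/", path)


if __name__ == "__main__":
    # A mix of forward slashes, backslashes and double backslashes.
    messy = "D:\\www/wordpress\\\\public_html/wp-content\\plugins/civicrm/civicrm/"
    print(normalize_path(messy))
```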

 

 

October
25

Blog search using Solr

Posted In: PHP, Solr, Web, Wordpress by graham

I'm speaking at WordCamp Victoria - January 14, 2012 - University of Victoria

Update January 16, 2012: Thanks to those who attended the presentation at WordCamp. I have uploaded a revised set of presentation slides in PDF format that includes screen captures of the demonstration portions. If anyone has more questions, please leave comments on this post and I will try to answer them. blog-search-using-solr (PDF; 1.8 MB)

Update January 2012: I am pleased to be presenting "Blog Search Using Solr" at WordCamp Victoria 2012. As of today there may still be tickets available, so hurry and get yours now if you haven't already. See you there!

Original topic description: If you discover the default search in WordPress is too basic, you may benefit from installing a Solr for WordPress plugin. Enabling the power of the Lucene search engine implemented in the Solr server sounds daunting, even to me. The search technology used by Digg, Netflix and Acquia (and others: http://wiki.apache.org/solr/PublicServers) could be yours. By using it, you will discover whether the advantages outweigh the disadvantages. Features like faceted search, keyword highlighting, and more are just the start. Once you see the plugin in action, you will understand the untapped search potential it provides.

You will find this blog using Solr for WordPress right now, and it is still a work in progress (so hopefully it still works when you try it).

June
10

Each day, more computer users will free themselves from the PC*. Predictions are for the death of the PC, though in this week's Apple WWDC 2011 keynote, with the introduction of iCloud, it has only been demoted and not yet killed. I say the PC will never be dead, for some.

This freeing trend is twofold: taking the computing power and connectivity mobile, and liberating the data to external storage. Cloud is the fashionable term for this (cloud is a slight misnomer, but let's keep it simple). You can be partly mobile, and slightly cloudy, but some of us cannot completely unshackle ourselves from the PC platform. Why? Because the mobile devices and applications still can't do it all.

If you're a computer amateur or novice, ComputerLight may suit all your needs. If you don't use a computer at work, you may use one as an amateur for email, video calling, or a digital camera. Novices are those who use a computer at work, but only send emails and write reports. In short, you really don't do much computing, and you don't need a PC (except for the larger screen and full size keyboard benefits).

But there are some of us who are still tied, and probably will be forever tied, to the PC. The list of who this applies to is seemingly endless. Here is the question to ask for each task you do; if the answer is ever no, you'll never be free of the PC:

Q. Is there an app for that?

No app? Sorry, you're not free yet. You build apps? Then you'll never be free from the PC, and it will live on.

* The PC I refer to is any stationary computer that inhabits a home or office and jealously locks files into its boxy confines. This definition of a PC paints Windows and Mac machines with the same brush.


P.S. For all the readers of my blog, who I imagine I can count on the fingers of one hand, I plan to be a more active blogger in the future. To encourage me, leave a comment so the spammers are not alone in the comments approval list.

July
13

Tasked with importing thousands of records from a different product into CiviCRM software, I dragged my feet. I was not looking forward to the learning curve and time required to fully understand the import procedure and how to do it error free.

To test and clean the data without messing with the live CiviCRM installation, I created a separate Drupal+CiviCRM installation to test on.

If your data is unique or complex, importing will not be easy, regardless of the type of software. I was supplied a CSV export file, which I edited in Microsoft Excel and imported into CiviCRM 3.1.4. I could write an entire chapter on cleaning the data and planning the data import, so I will not go into that here. In short, here is what I found about the CiviCRM Import Contacts feature:

Likes:

  • Custom data fields, with multiple values for each field, can be imported in one step from a CSV file.
  • You can create a group for each set of imported data, which allows you to view or delete the data you just imported.
  • Relationships between an Organization and a Person can be made easily enough, using matching rules.

Dislikes:

  • The speed at which contact imports occur is slow.
  • The fragility of the import procedure.

Now, more about my dislikes. Regarding the import speed, I'm sure this will be improved in subsequent versions of CiviCRM. What CiviCRM lacks in import speed it makes up for in the vast number of import features, such as saving the import field mapping, creating a group on import, checking for data errors, and checking for duplicate records.

Regarding the fragility of the import procedure, there could be some trial and error required in order to get your data to import just right. I will address some specific problems I found that are not so straightforward to discover.

1. Strange characters

The result of this example of a "strange" character was that CiviCRM stopped the import of the file on the line directly before this row. Suggested fix: change the character in your data to something acceptable.

CiviCRM import does not like strange characters

2. Rows missing fields

For some reason, some rows in the data I was given were missing cells at the end. Excel will not give you any indication that the cells are missing, and it took a trick to get Excel to recognize and save them. You will only see the missing fields if you open the CSV file in a text editor, like TextPad:

CiviCRM import does not like missing fields

The absence of commas at the end of the last two lines indicates that fields are missing in your CSV file. To fix this, I created a new column in Excel named "Junk", and inserted a space into each cell in that column (use the fill down feature to do this quickly). When importing data in CiviCRM, simply have it ignore this last "Junk" column. In this way, every row in the CSV file will contain the same number of fields.

CiviCRM import, add a junk column
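If you would rather not go through Excel, the same result can be reached by padding the short rows directly. This is an alternative to the "Junk" column trick, not what I actually did, and it is only a sketch: the function name and the sample data are made up, and it assumes the widest row in the file has the correct number of fields.

```python
import csv
import io


def pad_rows(rows):
    """Pad every row with empty strings so all rows match the widest row."""
    rows = [list(row) for row in rows]
    width = max((len(row) for row in rows), default=0)
    return [row + [""] * (width - len(row)) for row in rows]


if __name__ == "__main__":
    # Hypothetical export: the last two rows are missing trailing fields.
    sample = "Name,City,Phone\nAcme,Victoria,555-0100\nGlobex\nInitech,Vancouver\n"
    padded = pad_rows(csv.reader(io.StringIO(sample)))

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(padded)
    print(out.getvalue())
```

With real files, replace the io.StringIO objects with files opened using newline="" as the csv module documentation recommends.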

3. Importing Country and State values

This was a frustrating bug until I found a working method. Organizations that should be located in the United States were importing as "Iraq" or "Suriname". If a row being imported had a State or Province, then the Country value was being ignored, and the State of some other country was selected. I found that to combat this error, you can place the Country column of your import file before the other address columns.

CiviCRM address columns
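Moving the column is easy enough in Excel, but for repeated imports it can be scripted. Here is a small sketch, assuming the first row holds the column names and every row has the same number of fields; the function name and the sample column names are hypothetical.

```python
def move_column_before(rows, column, before):
    """Return a copy of rows with `column` moved to just before `before`.

    Assumes rows[0] is the header and all rows have the same width.
    """
    rows = [list(row) for row in rows]
    header = rows[0]
    # Column indexes in their current order, without the one being moved.
    order = [i for i, name in enumerate(header) if name != column]
    target = [header[i] for i in order].index(before)
    order.insert(target, header.index(column))
    return [[row[i] for i in order] for row in rows]


if __name__ == "__main__":
    data = [
        ["Name", "Street", "City", "State", "Country"],
        ["Acme", "1 Main St", "Seattle", "WA", "United States"],
    ]
    for row in move_column_before(data, "Country", "Street"):
        print(row)
```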

With these three peculiarities of the CiviCRM Import Contacts feature explained, I hope this helps others who run into similar errors. If it has helped you, or if you have some more tips, your comments are welcome.

May
21

Web fonts are back for good

Posted In: Web by graham

Dear Google,

I commend you for giving us the Google Font Directory at http://code.google.com/webfonts and the API. Ever since Netscape Navigator 4, I have anticipated using web fonts again, and that day has finally come.

XOXO - Graham

January
19

I like TELUS High Speed Turbo

Posted In: Misc by graham

The nice people at TELUS have upgraded my ADSL speed to their High Speed Turbo. I've only had it for a day, and I cannot complain about getting 5Mb/second downloads. Kudos - now, please work on increasing the maximum upload speeds.

December
2

Nothing works completely. I should start by saying that CS Cart has impressed me greatly, and it is probably the best shopping cart software I've used. The administrator area is simple and powerful, and the templates (skins) are relatively easy to customize.

Getting shopping cart software to run properly for a business in Canada is always difficult. It's a combination of little effort on the developer's part to build it to work here, and what looks like no time spent on testing.

CS Cart gets full points for making it straightforward to create tax rates for different locations - PST for the province, GST for federal taxes, and HST for those three provinces that require it currently (the number of provinces will increase in 2010).

CS Cart does not get full marks for the Canada Post shipping module in CS Cart version 2.0.8. It almost does the minimum required, but it has a bug. I think the Canada Post module is relatively new, so hopefully this will be improved in the future.

For more than an hour, I was troubleshooting why it would correctly estimate shipping from Canada to Canada, but it would never estimate from Canada to the USA. It always came back with the error message, "Unfortunately no shipping options are available for your location". After digging through the code, I realized that:

  1. No matter where the customer was located, CS Cart always asked for domestic shipping products from Canada Post, and
  2. After an hour of troubleshooting and debugging the AJAX-enabled shipping estimator, and trying to decipher the session data used by the software, I was nowhere close to finding the root cause in the CS Cart code.

Having given up looking for the optimal solution, I just hardcoded a fix. The code below was added to the /shippings/can.php file at line 87 (the first line is included here as a reference). It short-circuits the bad logic to correctly state "if the customer is in the USA, get shipping quotes for the Xpresspost USA product".

    list($header, $result) = fn_http_request('POST', 'http://sellonline.canadapost.ca:30000', $request);

    // Short-circuit the code used, because it continues
    // to use a Canadian shipping code for USA customers
    if ($location['country'] == 'US') {
        $code = '2030';
    }
September
30

Be a NaNoWriMo rebel

Posted In: Misc by graham

After a few days of pondering, I decided today I will participate in the 2009 NaNoWriMo (National Novel Writing Month). But I am not going to write a novel. So that leaves non-fiction, and labels me a rebel!

NaNoWriMo Rebel 2009

Over the previous nine years of NaNoWriMo, enough people have rebelled to warrant a separate special area in the online forums. I would have been a rebel anyway, but it's reassuring to know I'll be joined by others.

I've chosen non-fiction because I want whatever is produced by the end of the month to be something I've always planned to write, even if it is a first draft.

My procrastination about writing a book is just as common as that of the novelist participants; I've always planned to write a book, but it's never gotten off the ground. NaNoWriMo motivates me to attempt to reach the goal and win by writing 50,000 words.

Other things I like:

  • Anyone and everyone can be a NaNoWriMo winner
  • It only lasts one month
  • I'm writing with others, virtually
  • I can write anywhere at any time during November
  • It costs nothing out of pocket
  • The time spent is well spent
  • Even if you "lose", there are fringe benefits

Personally, some things I hope to gain:

  • Prove my idea is a good one (or disprove it)
  • Write a lengthy draft at the fastest speed possible
  • Force myself into a daily writing regime, hoping this will ignite my blogging and continuous writing

You may ask, what is my topic? Well, the conversation would probably go something like this,

You: "So you're a non-fiction NaNoWriMo rebel, huh? What's your topic?"
Me: "Oh... I see what you're after - get your own topic!"

If you've ever planned to write anything - novel or otherwise - consider NaNoWriMo this year.

September
15

Before:

After:

When you need to understand a new database schema, making a model like this is a big help. I still like DBDesigner4 from fabFORCE.net to get these simple models done. Yes, I know there's a much newer GUI program, MySQL Workbench, that could do a better job. One day I'll have time to test you again, Workbench. But for me, DBDesigner4 is a simple yet powerful tool.
