9.3.12

Difference between a record and a row

Google Refine makes a clear distinction between a row and a record. This section explains the difference between the two and the advantages of working in records mode.


A row is a single line of your project. The total number of rows is indicated at the top of your project. In this project we have 158 rows. 




A record is a combination of one or more rows that identify a unique object and share the same value in the first column. In this case, Johannesburg and Pretoria share the language Afrikaans, which is a characteristic common to the two rows. The total number of records is displayed at the top of your project when you switch to records mode. See the article on creating records in your data set.


                         
Google Refine uses alternating grey and white backgrounds to identify each record. Please also note that the third column (after the star and flag) counts the number of records, not the number of rows.

Google Refine creates records based on blank cells in the first column of your project. For example, records can be generated when you transpose columns into rows or when you blank down cells while removing duplicates. The new row will share all the same data as the previous one, except for the transposed column.
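
The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Google Refine's actual implementation: the function name group_into_records is hypothetical, and the sample rows are made up (the Durban / Zulu row is invented for the example).

```python
def group_into_records(rows, key_column):
    """Group consecutive rows into records.

    A row with a non-blank value in key_column starts a new record;
    a row with a blank value there is attached to the record above it.
    """
    records = []
    for row in rows:
        key = row.get(key_column)
        is_blank = key is None or str(key).strip() == ""
        if is_blank and records:
            records[-1].append(row)   # continuation of the current record
        else:
            records.append([row])     # first row of a new record
    return records


# Illustrative data only: "Language" plays the role of the first column.
rows = [
    {"Language": "Afrikaans", "City": "Johannesburg"},
    {"Language": "", "City": "Pretoria"},
    {"Language": "Zulu", "City": "Durban"},
]

records = group_into_records(rows, "Language")
print(len(rows), "rows,", len(records), "records")
```

With this sample, the three rows collapse into two records, because the Pretoria row has a blank first column and therefore belongs to the Afrikaans record.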

Working in records mode can be useful in many cases:
  • When you are in records mode, the full record is displayed when you facet a field if at least one of its rows matches your facet criteria. This gives you a broader view of the data, not only of the matching rows.
  • Filling down based on records is safer using the expression row.record.cells["data"].value[0] (see tutorial).
  • You can have the data belonging to a single record split across several rows.
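
The first two points can be illustrated with a small Python sketch. It mimics the behaviour described above and is not how Google Refine works internally: the helper names (split_records, facet, fill_down_by_record), the "Country" column, and the sample rows are all assumptions made for the example.

```python
def split_records(rows, key_column):
    """Split rows into records: a blank key_column cell continues the record above."""
    records = []
    for row in rows:
        if row.get(key_column) in (None, "") and records:
            records[-1].append(row)
        else:
            records.append([row])
    return records


def facet(rows, key_column, column, value, mode="rows"):
    """Return the rows shown when faceting on column == value."""
    if mode == "rows":
        # Rows mode: only the rows that match are kept.
        return [r for r in rows if r.get(column) == value]
    # Records mode: a whole record is kept if any one of its rows matches.
    kept = []
    for record in split_records(rows, key_column):
        if any(r.get(column) == value for r in record):
            kept.extend(record)
    return kept


def fill_down_by_record(rows, key_column, column):
    """Give every row of a record the record's first non-blank value in column."""
    filled = []
    for record in split_records(rows, key_column):
        values = [r[column] for r in record if r.get(column) not in (None, "")]
        first = values[0] if values else None
        # The value never crosses a record boundary.
        filled.extend({**r, column: first} for r in record)
    return filled


# Illustrative data only.
rows = [
    {"Language": "Afrikaans", "City": "Johannesburg", "Country": "South Africa"},
    {"Language": "", "City": "Pretoria", "Country": ""},
    {"Language": "Zulu", "City": "Durban", "Country": "South Africa"},
]

in_rows_mode = facet(rows, "Language", "City", "Pretoria", mode="rows")
in_records_mode = facet(rows, "Language", "City", "Pretoria", mode="records")
filled = fill_down_by_record(rows, "Language", "Country")
```

Here in_rows_mode contains only the Pretoria row, while in_records_mode contains both the Johannesburg and the Pretoria rows, since they form one record. In filled, the Pretoria row takes its country from its own record and never from a neighbouring one, which is the reason a record-based fill down is the safer choice.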