The Components of the Text Import Druid

The Text Import druid consists of three panels with the middle panel differing according to the type of data structuring used.

The first panel allows users to configure the character encoding used by the file, to determine the character sequences used to separate lines, to configure the type of structuring being used, and to select the lines of the file to import. The second panel allows the user to define the separation strategy used for each value. For separated value files, this involves defining the separating character sequences and the text indicating character. For fixed width files, this involves defining the width of each column. The third panel allows the user to select the columns to be included during the import and to select the format of the values in each column.

Users navigate the Text Import druid by clicking on the Forward button on each panel after they have configured the settings properly. The third panel contains a Finish button which causes the file to be imported to a workbook using all the settings as they are configured.

9.4.2.1.  The first panel of the Text Import Druid.

The first panel of the Text Import Druid allows users to set the file encoding, to determine the character sequences used to separate lines, configure the type of structuring being used and select the lines of the file to import.

Figure 9-2 The first panel of the Text Import druid with the component areas labeled with callouts.

The different components of the first panel of the Text Import druid with each component labeled with a callout.

The purpose of each labeled component in Figure 9-2 is explained below:

The components of the first panel
1 - The file encoding selection menu.

This drop down menu provides a list of encoding schemes for the characters in the text file. By default, Gnumeric selects the encoding scheme used by the locale of the user. See Section 9.4.1.1 ― Character Encodings for more details.
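The effect of the encoding choice can be illustrated outside Gnumeric with a minimal Python sketch. It is not Gnumeric's own code; the helper name `decode_text` and the sample word are assumptions made for the illustration. The sketch decodes the same bytes with two different encodings and falls back to the locale's preferred encoding when none is given.

```python
import locale
from typing import Optional


def decode_text(raw: bytes, encoding: Optional[str] = None) -> str:
    """Decode the raw bytes of a text file (hypothetical helper).

    When no encoding is given, fall back to the preferred encoding of
    the user's locale, mirroring the default described above.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


# The same bytes read with the right and with the wrong encoding.
raw = "café".encode("latin-1")
as_latin1 = decode_text(raw, "latin-1")
# errors="replace" substitutes U+FFFD for bytes that are invalid in UTF-8.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_latin1)
print(as_utf8)
```

Choosing the wrong encoding typically shows up as replacement symbols or unexpected accented characters in the preview area.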

2 - The line break character selector.

These three check boxes can be selected individually or together to define the sequences which will be interpreted as line break indicators. Generally, selecting all three boxes will produce the correct results.

The errors produced if the wrong combination of boxes is selected will include the entire file being placed on a single line, empty lines appearing between the lines of the file, or undefined symbols appearing at the beginning or end of almost every line. See Section 9.4.1.2 ― Line break delimiters for more details.
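The three kinds of error listed above can be reproduced with a small Python sketch. This is an illustration of the general idea, not Gnumeric's implementation; the function `split_lines` and the sample text are assumptions.

```python
import re


def split_lines(text, breaks):
    """Split text on any of the given line break sequences (hypothetical helper).

    Longer sequences are tried first so that "\r\n" is matched as one
    break rather than as "\r" followed by "\n".
    """
    ordered = sorted(breaks, key=len, reverse=True)
    pattern = "|".join(re.escape(b) for b in ordered)
    return re.split(pattern, text)


sample = "a,b\r\nc,d\r\ne,f"

# All three sequences selected: the expected three lines.
all_three = split_lines(sample, ["\n", "\r", "\r\n"])

# Only "\n" selected: a stray "\r" is left at the end of most lines.
only_lf = split_lines(sample, ["\n"])

# "\r" and "\n" selected but not "\r\n": empty lines appear between lines.
cr_and_lf = split_lines(sample, ["\r", "\n"])

# A file using "\r" read with only "\n" selected: everything on one line.
one_line = split_lines("a,b\rc,d", ["\n"])

print(all_three, only_lf, cr_and_lf, one_line, sep="\n")
```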

3 - The data structuring system selector.

These two push buttons allow the choice between the two different structuring schemes: data structured by placing a separating character between the data values, and data organized in fixed width columns. Note that this choice will determine which panel will be shown as the second panel of the druid. See Section 9.4.1.3 ― Data Structuring Strategies for more details.

4 - The line range spinboxes.

These two spin boxes allow the user to select the start and end rows for the data import. The spin boxes can be used either by typing a new value in the text entry area where the numbers are displayed, or by clicking with the mouse on the up arrow to increase the number and on the down arrow to decrease the number.

For instance, if the text file contained a large header area with meta information, this header could be excluded from the data imported to the Gnumeric worksheet by increasing the number of the starting, "From", line.
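As a rough sketch of what the line range does, the following Python snippet keeps only the lines between a 1-based start and end line. The function name `select_lines`, its parameters, and the sample lines are made up for the illustration and do not come from Gnumeric.

```python
def select_lines(lines, first=1, last=None):
    """Return lines first..last, counted from 1 and inclusive (hypothetical helper).

    When last is None, every line up to the end of the file is kept.
    """
    end = len(lines) if last is None else last
    return lines[first - 1:end]


# Sample file with a two-line header of meta information (invented data).
lines = [
    "# exported by some tool",
    "# units: kg",
    "name,weight",
    "apple,1.5",
    "pear,2.0",
]

# Raising the starting line to 3 skips the header.
data = select_lines(lines, first=3)
print(data)
```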

5 - The preview area.

This area displays a preview of the file as it will be interpreted when the settings that are currently selected in this first panel are applied.

6 - The button area.

These four buttons allow the user to navigate the druid. The Help button should open the Gnumeric manual to this section. The Cancel button will dismiss the dialog and return the user to the worksheet. The Back button is disabled since this is the first panel of the druid. The Forward button will bring up the next panel in the druid.

9.4.2.2.  The second panel of the Text Import Druid used for separated data

The second panel of the Text Import Druid used for separated data allows the user to configure the character sequences used to separate the values in each row and to configure the text delimiting characters. Gnumeric, by default, guesses which characters are being used to separate values and pre-sets those characters. The user can, however, reconfigure these characters.

Figure 9-3 The second panel of the Text Import druid for separated data with the component areas labeled with callouts.

The different components of the second panel of the Text Import druid for separated data with each component labeled with a callout.

The purpose of each labeled component in Figure 9-3 is explained below:

The components of the second panel for separated data
1 - The separator definition area.

This area allows the user to define the characters used to separate data value fields within each row. The checkboxes can be pressed to add or remove characters from those treated as separators. Additionally, the 'custom' type allows the user to define either other single characters, or a particular character sequence used to separate values. The preview area in the panel will show the file processed with the rules which have already been applied.

Generally, this type of file structuring uses a single character to separate fields but it is possible to use either several different characters or to use a sequence of characters. For example, it would be possible to use the old telegraphic convention of separating phrases with the word 'STOP' by selecting the 'custom' separator type and entering the character sequence 'STOP' in the text field.

This area also includes a checkbox enabling two separator sequences that immediately follow one another to be treated as a single separator. This option will only be useful where data is imported with one or more completely empty columns and no partially filled columns. If this option is checked and the data file has partially filled columns of data, the columns will be jumbled during the text import operation.

See Section 9.4.1.3 ― Data Structuring Strategies for more details.
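The separator rules described for this area can be sketched in a few lines of Python. This is only an illustration under assumed names (`split_fields`, `merge_adjacent`), not the code Gnumeric uses. It shows a custom 'STOP' separator and how treating adjacent separators as one shifts values when a column is only partially filled.

```python
import re


def split_fields(line, separators, merge_adjacent=False):
    """Split one row on any of the given separator sequences (hypothetical helper).

    With merge_adjacent=True, a run of separators that immediately
    follow one another counts as a single separator.
    """
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(s) for s in ordered)
    if merge_adjacent:
        pattern = "(?:" + pattern + ")+"
    return re.split(pattern, line)


# A custom character sequence used as the separator.
telegram = split_fields("ARRIVE MONDAYSTOPSEND MONEY", ["STOP"])

# A partially filled second column: the first row has no value there.
rows = ["1,,3", "4,5,6"]
kept = [split_fields(r, [","]) for r in rows]
merged = [split_fields(r, [","], merge_adjacent=True) for r in rows]

print(telegram)
print(kept)    # the empty field keeps "3" in the third column
print(merged)  # "3" slides into the second column: the rows no longer line up
```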

2 - The text indicating character area.

Separated value files often additionally define a character used to indicate the start and end of a data element which should be considered a single text entry. This strategy allows the inclusion of text entries which include the value separator.

For example, a file structured as a comma separated value file could use the double quotation mark to delimit text values and would then be able to include text values such as "Zoe, Mark, Sally".
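The role of the text indicating character can be seen with Python's standard `csv` module, used here purely as an illustration and not as a description of Gnumeric's parser. The sample row is invented for the example.

```python
import csv
import io

# One row of a comma separated file whose second value contains commas.
line = 'team,"Zoe, Mark, Sally",3\n'

# With the double quotation mark as the text indicating character,
# the quoted value stays together as one field.
quoted = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

# With no text indicating character, every comma separates a field
# and the quotation marks are kept as ordinary characters.
unquoted = next(csv.reader(io.StringIO(line), delimiter=",", quoting=csv.QUOTE_NONE))

print(quoted)
print(unquoted)
```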

3 - The preview area.

This area displays a preview of the file as it will be interpreted when the settings that are currently selected in the first and second panels are applied.

4 - The button area.

These four buttons allow the user to navigate the druid. The Help button should open the Gnumeric manual to this section. The Cancel button will dismiss the dialog and return the user to the worksheet. The Back button will take the user back to the first panel, without, however, changing the settings in this second panel. The Forward button will bring up the next panel in the druid.

9.4.2.3.  The second panel of the Text Import Druid used for fixed width data

The second panel of the Text Import Druid used for fixed width data allows the user to define the widths of each column to be imported. Gnumeric provides a mechanism to automatically guess the widths of the columns and allows the user, using the mouse, to define the widths of the columns.

Figure 9-4 The second panel of the Text Import druid for fixed width data with the component areas labeled with callouts.

The different components of the second panel of the Text Import druid for fixed width data with each component labeled with a callout.

The purpose of each labeled component in Figure 9-4 is explained below:

The components of the second panel for fixed width data
1 - The automatic column discovery button.

This leftmost button, named Auto Column Discovery, will cause Gnumeric to scan the file and attempt to assign the columns automatically. The example presented in Figure 9-4 shows one result after this button has been pressed: many of the columns were discovered automatically, but the second and third columns were misidentified. Nonetheless, the automatic mechanism provides a useful starting point. The definition of the columns can be refined using the methods described below.
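The manual does not describe how the automatic discovery works internally. One plausible heuristic, offered here only as an assumed sketch and not as Gnumeric's actual algorithm, is to start a new column wherever a character position that is blank in every line is followed by a position that is not. The function name `guess_column_starts` and the sample lines are hypothetical.

```python
def guess_column_starts(lines):
    """Guess 0-based column start positions for fixed width text.

    Assumed heuristic: a column starts wherever a position that is
    blank in every line is followed by one that is not blank in all lines.
    """
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    blank = [all(row[i] == " " for row in padded) for i in range(width)]
    starts = [0]
    for i in range(1, width):
        if blank[i - 1] and not blank[i]:
            starts.append(i)
    return starts


clean = ["apple     1.50  kg", "pear      2.00  kg"]
good_guess = guess_column_starts(clean)

# Text values containing a space at the same position in every line
# fool the heuristic into splitting one column in two.
tricky = ["red apple 1.50", "big pear  2.00"]
bad_guess = guess_column_starts(tricky)

print(good_guess)
print(bad_guess)
```

A guess of this kind can misidentify columns in the same way as the example in the figure, which is why manual refinement remains necessary.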

2 - The column definition clearing button.

This rightmost button, named Clear, will clear all the column definitions and reset the file to a single column. This button should be used cautiously since there is no way to reverse its action and any carefully prepared column definition layout will be irretrievably lost.

3 - The preview and column width definition area.

This area acts as both a preview area and an area where users can define the column widths.

As a preview area, this area displays a preview of the file as it will be interpreted when the settings that are currently selected in the first and second panels are applied.

This area can also be used to define column widths. When the panel first appears, a single column will be defined. The automatic column discovery mechanism may split this single column into many more columns. The mouse can then be used to further divide columns or to join previously separate columns.

A new column can be defined by placing the mouse pointer where the column should start and double-clicking with the primary mouse button. This will split the column which used to contain this position and add a new column starting at this location.

To remove the definition of a column which already exists or to alter the ending position of a column, the context menu must be used. The context menu appears by clicking with one of the secondary mouse buttons. A column which has already been defined can be merged with the column on the left or right using the Delete and Merge Left or Delete and Merge right menu items. The size of a column can be increased or decreased by placing the mouse pointer inside the column area or header and using the Widen or Narrow menu items, respectively. Either of these will change the width of the column by moving the right hand end of the column.

The context menu can also be used to define new columns using the Split menu item but the double-click approach described above should be easier.
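The column operations described above can be modelled as edits to a list of column start positions. The following Python sketch is an illustration only; the helper names and the choice to strip surrounding spaces from each value are assumptions, not Gnumeric behaviour taken from this manual.

```python
def split_fixed(line, starts):
    """Cut a line into values at the given 0-based column starts."""
    bounds = list(starts) + [None]  # None means "to the end of the line"
    return [line[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def split_column(starts, position):
    """Start a new column at position, like a double-click at that point."""
    return sorted(set(starts) | {position})


def merge_left(starts, index):
    """Join column number index (0-based) with the column on its left."""
    if index == 0:
        raise ValueError("the first column has no column on its left")
    return starts[:index] + starts[index + 1:]


line = "apple     1.50  kg"

starts = [0]                        # the panel begins with a single column
starts = split_column(starts, 10)   # new column where "1.50" begins
starts = split_column(starts, 16)   # new column where "kg" begins
three_columns = split_fixed(line, starts)

two_columns = split_fixed(line, merge_left(starts, 1))

print(three_columns)
print(two_columns)
```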

4 - The button area.

These four buttons allow the user to navigate the druid. The Help button should open the Gnumeric manual to this section. The Cancel button will dismiss the dialog and return the user to the worksheet. The Back button will take the user back to the first panel, without, however, changing the settings in this second panel. The Forward button will bring up the next panel in the druid.

9.4.2.4.  The third panel of the Text Import Druid

This panel allows users to select and format the columns to be imported to the Gnumeric workbook. The first button allows the exclusion of empty columns on either of the outer sides of the columns with data. The second button allows the user to define the locale used to interpret the values in the file. The remaining area allows the user to predefine the data format to be used for all the values in each column. This area also allows the users to select which columns in the file will be imported to the Gnumeric worksheet. Finally, this panel provides the Finish button which is used to dismiss the dialog and import the file.

Figure 9-5 The third panel of the Text Import druid with the component areas labeled with callouts.

The different components of the third panel of the Text Import druid with each component labeled with a callout.

The purpose of each labeled component in Figure 9-5 is explained below:

The components of the third panel
1 - The trim of empty outer columns drop down list button.

This button provides a list allowing the user to select whether to trim any outer columns which are completely empty. The choices are to delete the columns on both sides, on neither side, or on one side only. This will only affect columns which have been previously defined but which contain no data values at all.
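A minimal Python sketch of this trimming, assuming rows of equal length and treating whitespace-only values as empty (both assumptions of the example, as is the function name), might look like this:

```python
def trim_empty_outer_columns(rows, left=True, right=True):
    """Drop completely empty columns from the chosen outer sides.

    Empty columns that lie between columns with data are never removed.
    Assumes every row has the same number of columns.
    """
    if not rows:
        return rows

    def is_empty(col):
        return all(row[col].strip() == "" for row in rows)

    start, end = 0, len(rows[0])
    if left:
        while start < end and is_empty(start):
            start += 1
    if right:
        while end > start and is_empty(end - 1):
            end -= 1
    return [row[start:end] for row in rows]


rows = [["", "a", "", "1", ""],
        ["", "b", "", "2", ""]]

both = trim_empty_outer_columns(rows)
left_only = trim_empty_outer_columns(rows, right=False)
neither = trim_empty_outer_columns(rows, left=False, right=False)

print(both)
print(left_only)
```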

2 - Locale definition for import drop down menu button.

This button provides a list of locales which can be set. The chosen locale will affect how numeric values are interpreted when they are imported. For instance, the locale will define the character expected as the decimal separator, which is the period character (.) in some locales and the comma character (,) in others. These locales generally then use the other character as the spacer grouping the digits in thousands.
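The difference the locale makes can be shown with a simplified Python sketch. It does not use real locale data; the function `parse_number` and its two separator parameters are assumptions standing in for what a locale would supply.

```python
def parse_number(text, decimal_sep=".", group_sep=","):
    """Interpret text as a number using the given separators (hypothetical helper)."""
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# Period as decimal separator, comma grouping the thousands.
period_decimal = parse_number("1,234.5")

# Comma as decimal separator, period grouping the thousands.
comma_decimal = parse_number("1.234,5", decimal_sep=",", group_sep=".")

# The same text gives different numbers under the two conventions.
as_period_decimal = parse_number("1.234")
as_comma_decimal = parse_number("1.234", decimal_sep=",", group_sep=".")

print(period_decimal, comma_decimal)
print(as_period_decimal, as_comma_decimal)
```

The last two results differ by a factor of one thousand, which is why the locale should be checked before importing numeric data.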

3 - The column data format selection list.

This list allows predetermining the format which Gnumeric will assign to each of the values in the columns selected below. Cell data formats are explained in Section 5.10 ― Formatting Cells.

To use this list, first select one or more columns in the preview area below, then select a data format in this list, and finally configure any details of the format. Number formats, for instance, allow the user to force numbers to contain a fixed number of digits after the decimal point.

4 - The column selection, inclusion, and file preview area.

This area allows users to select columns which will be preformatted, to select which columns to include in the import, and to preview the file. Each single column can be selected by clicking with the mouse pointer on the column header. Any single column can be excluded from the data imported to the Gnumeric worksheet by clicking in the checkbox in the column header to remove the check mark. The area also provides a preview of the data in the text file showing the effect of the current configuration.

5 - The button area.

These four buttons allow the user to navigate the druid. The Help button should open the Gnumeric manual to this section. The Cancel button will dismiss the dialog and return the user to the worksheet. The Back button will take the user back to the second panel, without, however, changing the settings in this third panel. The Finish button will dismiss the druid and cause the file to be imported into a new worksheet using the selected configuration parameters.