Good text I/O practice

The Ada standard library provides a general-purpose text I/O package named Ada.Text_IO. It is good for small problems where efficiency is not an issue, and its parsing and formatting routines are excellent. One challenge for compiler writers is that Ada.Text_IO is required to keep track of column, line, and page numbers. Another challenge is that the design of Ada.Text_IO requires many system calls, or extra copying of the data being read from or written to a file.

Below are some patterns for good text I/O practice. The tasks they solve are derived from Rosetta Code.

Read entire file

Ada.Direct_IO + Ada.Directories

Using Ada.Directories to first ask for the file size and then Ada.Direct_IO to read the whole file in one chunk:

with Ada.Directories,
     Ada.Direct_IO,
     Ada.Text_IO;
 
procedure Read_Entire_File is
   File_Name : constant String  := "read_entire_file.adb";
   File_Size : constant Natural := Natural (Ada.Directories.Size (File_Name));
   subtype File_String    is String (1 .. File_Size);
   package File_String_IO is new Ada.Direct_IO (File_String);
 
   File     : File_String_IO.File_Type;
   Contents : File_String;
begin
   File_String_IO.Open  (File, Mode => File_String_IO.In_File,
                               Name => File_Name);
   File_String_IO.Read  (File, Item => Contents);
   File_String_IO.Close (File);
 
   Ada.Text_IO.Put (Contents);
end Read_Entire_File;

This kind of solution is somewhat limited by the fact that the GNAT implementation of Ada.Direct_IO first allocates a copy of the object being read on the stack inside Ada.Direct_IO.Read. On Linux, you can use the following command (csh/tcsh syntax)

limit stacksize 1024M

to increase the available stack for your processes to 1 GB, which gives your program more room to allocate objects on the stack. In bash, the equivalent is ulimit -s 1048576, where the value is given in kilobytes.

This solution requires the Ada 2005 standard library (specifically package Ada.Directories) to work.
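
If raising the stack limit is not an option, one possible workaround is to put the buffer on the heap and read it through Ada.Streams.Stream_IO instead of Ada.Direct_IO. The following is only a minimal sketch under stated assumptions: the procedure name and file name are illustrative, and whether the runtime still makes a temporary copy while reading depends on the implementation.

```ada
with Ada.Directories;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

--  Illustrative sketch: the procedure and file names are assumptions.
procedure Read_Entire_File_Heap is
   use Ada.Streams.Stream_IO;

   File_Name : constant String  := "read_entire_file_heap.adb";
   File_Size : constant Natural := Natural (Ada.Directories.Size (File_Name));

   type String_Access is access String;
   procedure Free is
      new Ada.Unchecked_Deallocation (String, String_Access);

   File     : File_Type;
   --  The buffer lives on the heap, so its size is not bound by the stack limit.
   Contents : String_Access := new String (1 .. File_Size);
begin
   Open (File, Mode => In_File, Name => File_Name);
   --  'Read fills the already constrained string; no bounds are read.
   String'Read (Stream (File), Contents.all);
   Close (File);

   Ada.Text_IO.Put (Contents.all);
   Free (Contents);
end Read_Entire_File_Heap;
```

Like the Ada.Direct_IO version, this sketch asks Ada.Directories for the file size first, so it has the same Ada 2005 requirement.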

POSIX.Memory_Mapping

This approach maps the whole file into the address space of your process and then overlays the mapped region with a String object:

with Ada.Text_IO,
     POSIX.IO,
     POSIX.Memory_Mapping,
     System.Storage_Elements;
 
procedure Read_Entire_File is
   use POSIX, POSIX.IO, POSIX.Memory_Mapping;
   use System.Storage_Elements;
 
   Text_File    : File_Descriptor;
   Text_Size    : System.Storage_Elements.Storage_Offset;
   Text_Address : System.Address;
begin
   Text_File := Open (Name => "read_entire_file.adb",
                      Mode => Read_Only);
   Text_Size := Storage_Offset (File_Size (Text_File));
   Text_Address := Map_Memory (Length     => Text_Size,
                               Protection => Allow_Read,
                               Mapping    => Map_Shared,
                               File       => Text_File,
                               Offset     => 0);
 
   declare
      Text : String (1 .. Natural (Text_Size));
      for Text'Address use Text_Address;
   begin
      Ada.Text_IO.Put (Text);
   end;
 
   Unmap_Memory (First  => Text_Address,
                 Length => Text_Size);
   Close (File => Text_File);
end Read_Entire_File;

This solution requires the POSIX Ada API (implemented as FLORIST or WPOSIX) to work. (It has not been tested with an Ada 83 compiler.)

Summary

Using POSIX.Memory_Mapping is slightly faster than using Ada.Direct_IO, but you only really benefit from memory mapping if you don't actually need the whole file, as the operating system will only copy in the parts of the file that the application actually accesses.

Process text file

The task: read a file into a variable (possibly only a part of the file at a time) and write it out to another file.

Line by line

This solution reads from the file one line at a time. One nice thing about this solution is that you can easily switch it to read from standard input, and possibly from anything else your operating system considers a file.

with Ada.Command_Line, Ada.Text_IO; use Ada.Command_Line, Ada.Text_IO;
 
procedure Read_File_Line_By_Line is
   Read_From : constant String := "input.txt";
   Write_To  : constant String := "output.txt";
 
   Input, Output : File_Type;
begin
   begin
      Open (File => Input,
            Mode => In_File,
            Name => Read_From);
   exception
      when others =>
         Put_Line (Standard_Error,
                   "Cannot open the file '" & Read_From & "'. Does it exist?");
         Set_Exit_Status (Failure);
         return;
   end;
 
   begin
      Create (File => Output,
              Mode => Out_File,
              Name => Write_To);
   exception
      when others =>
         Put_Line (Standard_Error,
                   "Cannot create a file named '" & Write_To & "'.");
         Set_Exit_Status (Failure);
         return;
   end;
 
   loop
      declare
         Line : constant String := Get_Line (Input);
      begin
         -- You can process the contents of Line here.
         Put_Line (Output, Line);
      end;
   end loop;
exception
   when End_Error =>
      Close (Input);
      Close (Output);
end Read_File_Line_By_Line;

This solution requires the Ada 2005 standard library to work, because the function form of Ada.Text_IO.Get_Line was added in Ada 2005.

Notice how we avoid explicit checks for read access to the input file, for creation/write access to the output file, and for the availability of more data to process. Even if we put in explicit checks, we would still have to handle the same exceptions, because another application can change the state of the file system in parallel with this application, creating a race condition.
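
To illustrate how the line-by-line pattern carries over to standard input, here is a minimal sketch that copies standard input to standard output. The procedure name is an assumption made for this example; the only real change is that Get_Line and Put_Line are called without a file argument.

```ada
with Ada.Text_IO; use Ada.Text_IO;

--  Illustrative sketch: the procedure name is an assumption.
procedure Copy_Standard_Input is
begin
   loop
      declare
         --  Without a file argument, Get_Line reads from the current
         --  default input, which is standard input unless redirected.
         Line : constant String := Get_Line;
      begin
         --  You can process the contents of Line here.
         Put_Line (Line);
      end;
   end loop;
exception
   when End_Error =>
      null;  --  End of input reached; nothing to close.
end Copy_Standard_Input;
```

As in the file-based version, the end of the input is detected by handling End_Error rather than by testing for it in advance.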

POSIX.Memory_Mapping

The POSIX.Memory_Mapping solution for reading an entire file into memory practically solves this task as well. Still, it has some limitations which may make it irrelevant for some purposes:

  • It only works for an actual file (i.e. one stored on a file system). Specifically, it doesn't work for standard input, pipes, or TCP connections.
  • It is not line oriented (i.e. you have to parse line-breaks yourself).
  • It requires an implementation of the POSIX Ada API (for example FLORIST or WPOSIX).
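
To give an idea of what parsing line breaks yourself involves, the following sketch splits an in-memory String at LF characters. It is only an illustration under stated assumptions: the procedure name and the sample text are invented, a constant String stands in for the memory-mapped file, and only LF (not CR LF) is treated as a line break.

```ada
with Ada.Characters.Latin_1;
with Ada.Text_IO;

--  Illustrative sketch: the procedure name and sample text are assumptions.
procedure Split_Lines is
   LF : constant Character := Ada.Characters.Latin_1.LF;

   --  Stand-in for the String overlaid on the memory-mapped file.
   Text : constant String :=
     "first line" & LF & "second line" & LF & "last line, unterminated";

   First : Positive := Text'First;  --  Start of the current line.
begin
   for Index in Text'Range loop
      if Text (Index) = LF then
         --  The slice excludes the line break itself.
         Ada.Text_IO.Put_Line (Text (First .. Index - 1));
         First := Index + 1;
      end if;
   end loop;

   --  Handle a final line that is not terminated by a line break.
   if First <= Text'Last then
      Ada.Text_IO.Put_Line (Text (First .. Text'Last));
   end if;
end Split_Lines;
```

Because each line is a slice of the original String, no line is copied until it is passed on for processing.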
