The LOLITA natural language processor is one of a growing number of large-scale systems written entirely in a functional programming language. The system consists of over 47,000 lines of Haskell code (excluding comments) and performs a wide range of tasks, including semantic and pragmatic analysis of text, information extraction, and query analysis. The efficiency of such a system is critical: interactive tasks (such as query analysis) must ensure that the user is not inconvenienced by long pauses, and batch-mode tasks (such as information extraction) must achieve adequate throughput. For the past three years the profiling tools supplied with GHC and HBC have been used to analyse and reason about the complexity of the LOLITA system. Results have been good; however, experience has shown that in a large system the profiling life-cycle is often too long for detailed analysis to be practical, and the results can be misleading. In response to these problems, a profiler has been developed that records the complete set of program costs in so-called cost-centre stacks. These costs are then examined with a post-processing tool, allowing the developer to explore the program's costs in ways that are either impossible with existing tools or would require repeated compilation and execution of the program. The modifications to the Glasgow Haskell Compiler, based on a detailed cost semantics and an efficient implementation scheme, are discussed. The results of using this new profiling tool in the analysis of a number of Haskell programs are also presented. The overheads of the scheme and the benefits of the new system are considered, and an outline is given of how this approach can be adapted to assist with the tracing and debugging of programs.
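To make the idea concrete, the following is a minimal sketch of cost-centre annotation in GHC-style Haskell; the function names are illustrative and not drawn from the LOLITA system. With flat cost centres, all costs incurred by `shared` are lumped into a single entry regardless of who called it; with cost-centre stacks, the profile distinguishes the call paths MAIN -> parse -> shared and MAIN -> analyse -> shared, so each caller's contribution to the cost of `shared` is recorded separately.

```haskell
module Main (main) where

-- A helper used from two different call sites.  Under a flat
-- cost-centre profile its cost appears as one aggregate entry;
-- under a cost-centre-stack profile it appears once per stack of
-- enclosing cost centres, separating the two callers' shares.
shared :: Int -> Int
shared n = {-# SCC "shared" #-} sum [1 .. n]

parse :: Int -> Int
parse n = {-# SCC "parse" #-} shared (n * 2)

analyse :: Int -> Int
analyse n = {-# SCC "analyse" #-} shared (n + 100000)

main :: IO ()
main = print (parse 50000 + analyse 50000)
```

Compiled with `ghc -prof Main.hs` and run with `./Main +RTS -p`, the program writes a `Main.prof` report whose entries are stacks of cost centres rather than individual centres. These are the flags of current GHC releases, which incorporate cost-centre-stack profiling of the kind described here; the prototype implementation discussed in the paper may have differed in detail.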