Spectrum 2000 Mindware Ltd.






CRAIN'S PETROPHYSICAL HANDBOOK
c. 1978 - 2008 E. R. (Ross) Crain, P.Eng.
Rocky Mountain House, Alberta Canada T4T 2A2
403-845-2527 email us
Please be fair to the author - pay your Shareware Fee HERE
and receive a copy of Crain’s Petrophysical Handbook on CD-ROM at no extra cost.
Updated 4 July 2005

CHAPTER FOURTEEN: KNOWLEDGE BASED SPREADSHEETS FOR LOG ANALYSIS

Table of Contents
14.01 Introduction to this Chapter
14.02 Why Consider Spreadsheets for Log Analysis
14.03 What Is An Electronic Spreadsheet
14.04 What Is An Expert System
14.05 Designing An Expert System On A Spreadsheet
14.06 Advantages Of a Spreadsheet For AI Applications
14.07 Disadvantages Of a Spreadsheet For AI Applications
14.08 Software And Hardware Requirements
14.09 Examples Of Spreadsheet Analysis
14.10 In Conclusion
14.11 Addendum: Spreadsheet Analysis of Logs, 1985
14.12 Bibliagraphy for Chapter Fourteen
14.13 Exercises for Chapter Fourteen

Software review in Geobyte Magazine

Download META/LOG shareware spreadsheets

Continue to Chapter Fifteen

Publication History: The main part of this Chapter was originally presented at 12th Formation Evaluation Symposium, Canadian Well Logging Society, September, 1989 as "A Knowledge Based Spreadsheet System To Reduce Complexity in Log Analysis".

This paper was written a long time ago and many of the limitations of spreadsheet software and desktop computers mentioned in this Chapter have disappeared. Most of the comments on the virtues of spreadsheet software as compared to larger standalone log analysis packages still apply. I continue to use this software on most jobs - it is quite effective and very economical.

An earlier paper describing the first use of "Spreadsheet Analysis of Logs" was presented at the 10th Formation Evaluation Symposium, Canadian Well Logging Society, September, 1985. This earlier paper is truly obsolete, as none of the software products mentioned is currently available. It formed Chapter Fourteen of The Log Analysis Handbook published by Pennwell in 1986. It is included here at the end of this Chapter as an Addendum so that the reader can see the evolution of the spreadsheet concept as a petrophysical tool.

The current version of this spreadsheet, META/LOG PROFESSIONAL, runs in Lotus 1-2-3 for Windows. Other versions run in Excel for Windows. All versions are available for download for a small shareware fee. You can also download a software review "As Easy As..." META/LOG Software Review by R. Y. Elphick. Geobyte, Fall 1989.

This software has been used for many projects, from single wells on every continent to the entire Burgan oilfield in Kuwait (770 wells, 1500 feet per well). The math has been tuned on over 10,000 wells from shaly sands to fractured carbonates, from tar sands to oil and gas in granite reservoirs. View the Project List for a short summary of the more interesting jobs. Samples of the output can be found throughout this website.

CHAPTER FOURTEEN: KNOWLEDGE BASED SPREADSHEETS FOR LOG ANALYSIS

14.01 Introduction to this Chapter
A knowledge based expert system for log analysis, called META/LOG, based on an electronic spreadsheet (Lotus 1-2-3) is presented. The definition of spreadsheet and expert system terms, advantages and limitations of spreadsheets in AI applications, and some typical examples of knowledge based spreadsheet analysis are discussed.

The system incorporates about 140 rules related to appropriate log analysis methods, based on the rock type, fluid type, and available log data. In addition, the rules choose the best initial log analysis parameters for the methods selected, derived from a knowledge base containing over 350 parameters and a questionnaire which must be filled out by the user. A total of 30 different log analysis algorithms, offering 6700 uniquely different analyses, demonstrates why an expert system is helpful in reducing the complexity of modern log analysis.

In addition, core analysis, DST analysis, exploration economics, and well history modules are integrated into the same package, making it easy to calibrate results.

14.02 Why Consider Spreadsheets for Log Analysis?
The tools of the log analysis trade have evolved rapidly over the last 20 years, from charts and nomographs, to slide rules, to programmable calculators, to desktop microcomputers. Each method had or has its drawbacks. Charts are not very accurate or repeatable. Slide rules are difficult for some people to use and don't add or subtract very well.

Programmable calculators are easy to use but not too easy to program. You can't re-run your data with new assumptions without re-keying all the data. And you have to write down the answers, get them typed into a report, and..... Let's face it, these methods are as obsolete as the ES log.

Mini computers (or terminals to larger computers) may or may not be easy to use, but are almost invariably programmed by others, so custom analysis is difficult. And they're expensive, so only a small fraction of potential users have acsess to them. Because of their cost, companies are reluctant to discard obsolete systems and replace them with better ones, leaving users with useless systems or ancient technology.

Desktop micro computer systems, pioneered by the author and D.W. Curwen in 1976 (Ref 1) went some distance in bringing log analysis power to the people who needed it. Such systems have evolved into very powerful fourth generation languages (Ref 2) and can be obtained from a number of vendors.

Another solution, namely electronic spreadsheets, has been available for less than ten years, but only recently have computers and software been powerful enough, large enough, fast enough, and cheap enough to do log analysis. These programs are typified by products like Smart, Supercalc, Context MBA, and Lotus 1-2-3. A spreadsheet for log analysis, using Lotus 1-2-3 was described by the author in 1985 (Ref 3). Visicalc, the original electronic spreadsheet, died a financial death a few years ago.

However, most log analysis programs, including the spreadsheet described above, are really pretty dumb. They expect the analyst to know which method, parameters, and parameter values are best for any situation. For the novice or casual log analyst, such knowledge is too much to expect. This is where artificial intelligence, or expert systems programs play a role. They provide information to the user, gleaned from experts in the field, which overcomes the dumb-computer syndrome.

Successful well-log analysis is an acquired skill which is very dependent upon the experience of the analyst. The knowledge which an analyst brings to bear on a problem is very specific to the region being analysed, and therefore a considerable amount of local information is required for successful analysis. Much of this knowledge is available from published literature and from archives of previous work.

This information is termed the knowledge base or fact base of a log analysis expert system. Unfortunately casual users and experts tend to forget this knowledge and are forced to look it up or recreate it for each job. That's why it is useful to embed the knowledge within the program, in the form of a data base that can be updated as new knowledge is gained. Hard coded data sets, found in most programs, are not very easy to update.

A further step involves extracting analysis rules and methodology from an expert in log analysis. Rules are usually of three types: usage rules which dictate which method is the best choice for a given data set in a given area, parameter selection rules, and "what if?" or iterative rules for trying alternative methods or assumptions if results are not acceptable on the first attempt. This information is termed the rule base of an expert system.

An expert system enables a geologist or engineer to perform complex well-log analyses which in the past, could only be done with the assistance of a log analysis expert. In addition, any interpretation, whether by an expert or not, would require less work to provide more complete analysis results. Further, it allows experts to share and consolidate their knowledge and experience, for use by all analysts with access to the system.

Systems of this type were described by the author in 1985 (Ref 4). Although some of the systems are commercially available, their cost, complexity, and immaturity has restricted their use to date.

This paper presents an expert system for log analysis, written in Lotus 1-2-3, which is inexpensive, simple, and well tested. While 1-2-3 may seem like an inappropriate shell for an expert system, its ease of use and fourth generation features provided many advantages not found in other low cost shell.

14.03 What Is An Electronic Spreadsheet?
An electronic spreadsheet is a computerized replacement for the pencil and columnar pad so familiar to log analysts who do hand calculations at the well site or in the office. One virtue of the electronic spreadsheet is that it can be much larger than a sheet of paper, often allowing up to 256 columns of data (with up to 72 characters per column) by 8000 rows long. The screen or monitor of the computer is a window on this large array of data. Definitions of spreadsheet terms are shown in Figure 14.01.

Each row is assigned a row number and each column is headed by a column letter. The intersection of a row and column is termed a cell. Cells are identified by their row and column designation (e.g. C14). Movement around the portion of spreadsheet seen on the screen is performed by moving a cursor using the keyboard, lightpen, or mouse attached to the computer.

Cells can contain four kinds of data:

1. text or labels
2. numbers (raw data or answers)
3. formulae or algorithms
4. spreadsheet functions

Because a cell can contain data, a formula to compute results, or a reference to another cell which contains, the result, a cell thus looks like a simplified frame in a frame-based data structure. A cell containing text or a value looks like an element in a data base, and can be used either as an element or a frame. A row or column, or an array of rows and columns, looks like a record (a list of elements) in a data base and can be manipulated in the same way.

Mathematical formulae in spreadsheets show relationships between cells (e.g. C1=+A1+B1). If the relationship in many cells is similar (e.g. C1=+A1+B1, C2=+A2+B2, etc.) then each individual relationship must be described. There are spreadsheet commands which make it easy to copy a cell to other locations, preserving the relationships between cells. Many other commands allow the program designer to move or erase cells, insert and delete rows and columns, format and justify data and text, and many other housekeeping tasks.

A cell can contain very complex algorithms, which makes them eminently suitable for log analysis. Up to 240 characters can be used, and in rare cases, this limit has been reached.

Spreadsheet functions are shortcuts which help the user perform common mathematical computations, such as SUM, AVERAGE, MINIMUM, MAXIMUM, IF...THEN, and PRESENT VALUE. Some of these are not even available in conventional programming languages but must be coded uniquely each time they are needed.

The act of building a log analysis model with a spreadsheet defines all the mathematical relationships in the model, as well as the raw data and analysis parameters. A trained user can edit formulae, insert rows or columns, fix typographical errors, enter new data, and recompute results without changing a Basic or Fortran program. The spreadsheet is the program and the user is the programmer!

You can also play "what if?". You could vary the water resistivity, matrix or fluid parameters, or shale values and see the change in the log analysis results in a few seconds. Computations can be revised immediately after entering a change, or after entering all changes, as decided by the user.

The original spreadsheet, after testing and debugging, can be used immediately, or saved as a standard analysis package. Standard spreadsheets can be customized to fit individual problems when they are used. Managers or professionals may customize spreadsheets for their own use, or for use by other professional or technical staff in other departments.

Results and data are saved on disc by saving the entire spreadsheet or by extracting only desired portions (to conserve space). Formatted printouts are provided by the WYSIWYG (What You See Is What You Get) process. By using the spreadsheet macro programming functions, a full report with attractive layouts is a simple task for the user, although not so simple for the programmer.

Standard crossplots and histograms are created by the graph commands of the spreadsheet. Regression and sorting are also useful commands.

Spreadsheet software is the most widely used application for microcomputers. Over 5,000,000 copies of Lotus 1-2-3 and its clones have been sold and 1-2-3 has been number one on the software best seller list (the programming world's Top 40) every week for the last 150 weeks. It is also universally portable, being available on nearly every desktop computer now sold. Versions even exist for some large mainframe computers. Thus the allure of these features was too much to resist. An expert system for log analysis just had to be built on this fantastic vehicle.

14.04 What Is An Expert System?
Expert systems and artificial intelligence are not new concepts. Researchers have worked to develop artificial intelligence since the early 1950's for a number of reasons. One is to help understand the human thinking process by modelling it with computers. Another is to make better computer hardware by modelling the computer more closely after the human brain. More achievable goals, such as making computers act more human or easier for humans to use, are also part of the AI spectrum, as are robotics and pattern recognition or artificial vision. Natural language understanding, automatic translation, and automatic computer programming are other aspects of artificial intelligence.

Figures 14.01, 14.02, and 14.03

In the petroleum industry, well log analysis, property evaluation, reservoir simulation, drilling operations, and geologic interpretation have been attacked with AI techniques.

The distinctions between conventional programming, intelligent programming, and artificial intelligence are not hard and fast. Conventional programming uses procedural languages such as Basic or Fortran to create sequential code to solve explicitly stated problems. Intelligent programming goes one step further. Here data bases are used to hold much of what would otherwise be hard code. As a result, the system is much more flexible, and program sequence or content can be modified at will by the user, as can the knowledge contained in the numeric and algorithmic sections of the data base.

Artificial intelligence software uses a process called symbolic processing instead of linear processing of variables in sequence. Although conventional computing uses symbols (variables) in describing the program, the symbols are not really manipulated by the operating system to create new symbols, relationships, or meanings. In artificial intelligence, new relationships between symbols may be found, if they exist, that were not explicitly stated by the programmer. This is usually called an undocumented feature or bug in conventional software.

In addition, symbols without values can be propagated through the relationships until such time as values become available, again without help from the programmer. Anyone who has had a divide by zero error while testing a program will appreciate this feature.

One of the most economically attractive facets of AI is expert systems development. Expert systems apply reasoning and problem solving techinques to knowledge about a specific problem domain in order to simulate the application of human expertise. Expert systems depend on knowledge about the particular specialty or domain in which they are designed to operate. The knowledge is provided by a human expert during the design and implementation stage, hence the name expert system. Such programs most often operate as a intelligent assistant or advisor to a human user. Modern usage invokes the phrase knowledge based system to mean the same as expert system.

Thus, an expert system consists of:

1. A knowledge base of domain facts and heuristics (rules) associated with the problem,

2. An inference procedure or control structure for utilizing the knowledge base in the solution of the problem, often called an inference engine,

3. A blackboard, or global data base, for keeping track of the problem status, the input data for the particular problem, and the relevant history of what has been done so far.

Figure 14.02 shows a block diagram of an idealized expert system, and definitions of important terms are shown in Figure 14.03.

The facts consist of a body of information that is widely shared, publicly available, and generally agreed upon by expertsd in a field. The heuristics are mostly private, little discussed rules of good judgement that characterize expert level decision making in the field. The rules may be difficult for the expert to verbalize, and hence are difficult to elicit or share. Some facts and/or heuristics may be proprietory to the user or user's organization, and are thus not shareable outside the organization.

As an example, the facts in an expert log analysis system are the known properties of rocks and fluids. The heuristics include mathematical rules such as Archie's water saturation equation, as well as usage rules which describe when this equation might be used in achieving the desired results.

Usage rules are based on the availability of log data and constraints concerning hole condition, borehole and formation fluid type, rock type, and tool or algorithm resolution. They are intended to provide the best initial set of algorithms to use.

The most popular approach to representing the domain knowledge needed for an expert system is by production rules, also referred to as SITUATION-ACTION rules or IF-THEN rules. A typical rule for a log analysis system might be:

IF matrix density is greater than sandstone matrix density
AND lithology is described as shaly sand
THEN suspect a heavy mineral OR cementing agent
OR suspect inadequate shale corrections
OR suspect poor log calibrations

Most conventional log analysis programs contain checks and balances of this type, coded in Basic or Fortran, with appropriate action being dictated by user defined logic switches. The virtue of an expert system knowledge base is that the expert can modify this rule set more easily than a hard coded program.

There are three different ways to use an expert system, in contrast to the single mode (getting answers to problems) characteristic of the more familiar type of computing. These are:

1. Getting answers to problems -- user as client,
2. Improving or increasing the system's knowledge -- user as tutor,
3. Harvesting the knowledge base for human use -- user as pupil.

Users of an expert system in mode 2 are known as domain specialists or experts. Those in mode 3 would be novices or casual users. Anyone might use the system in the usual mode 1 context.

An expert usually has many judgemental or empirical rules, for which there is incomplete support from the available evidence. In such cases, one approach is to attach numerical values (certainty factors) to each rule to indicate the degree of certainty associated with that rule. In expert system operation, these certainty values are combined with each other and the certainty of the problem data, to arrive at a certainty value for the final solution. Fuzzy set theory, based on possibilities can also be utilized.

An expert system can act as the perfect memory, over time, of the knowledge accumulated by many specialists of diverse experience. Hence, it can and does ultimately attain a level of consultant expertise exceeding that of any single one of its "tutors". There are not yet many examples of expert systems whose performance consistently surpasses that of an expert. There are even fewer examples of expert systems that use knowledge from a group of experts and integrate it effectively. However, the promise is there.

14.05 Designing an Expert System on a Spreadsheet
To demonstrate the use of expert systems concepts in a spreadsheet, we will use a working log analysis program called META/LOG. It contains the usual shale volume, porosity, lithology, water saturation, permeability, and productivity algorithms, and places for raw log data, analysis parameters, and results. In addition, other useful well data, such as DST analysis, well history, core analysis, and exploration economics are included. The functions of the system are summarized in Figure 14.04.

Developing the three essential ingredients of an expert system, the knowledge base, inference engine, and global data base, on a spreadsheet is really quite easy.

Figure 14.04: Features of META/LOG spreadsheet

A key component is a questionnaire which is filled out by the user. The questionnaire is a knowledge acquisition vehicle, designed to elicit facts known by the user that are not implicit in the log data, such as the local geologic setting and personal analysis preferences. This provides the system with necessary data about log availability and quality, rock type and texture, fluid type and viscosity, shale properties, and water resistivity data. The questionnaire used in META/LOG is shown in Figure 14.05.

The questionnaire is interactive; for example after entering water zone resistivity, porosity, depth, and temperature relationships, the actual formation water resistivity and temperature that will be used is displayed. If these values appear unreasonable, alternate sources can be derived within the questionnaire.

Figure 14.05: Knowledge Base Questionnaire

The main function of the questionnaire is to provide sufficient information to the expert system, using an English language interface, so that it can make choices that would otherwise be left for the user to make. After all, who remembers the Tpo value for water, the UMA for dolomite, or the PHI*SW for fine vuggy carbonate. Indeed, who even knows what the abbreviations mean.

The frames (cells) containing local parameters, world parameters, assertions, and mathematical algorithms constitute the knowledge or fact base. Basic parameters for sonic, density, neutron, and photo electric effect are stored for about 22 rock mixtures commonly found in productive oil and gas regions. Additional data for porosity, saturation, lithology, permeability, productivity, and recoverable reserves, all of which depend on the rock and fluid description, are also kept here. The fact base is easily accessible and values can be changed by an experienced analyst to reflect his personal belief or local knowledge.

The algorithms for log analysis are also in open code and can be edited by an experienced analyst. Thirty of the most popular algorithms are coded. With a complete log data set, more than 6700 unique log analyses are possible. The rule base is designed to show which individual algorithms are solvable and which is the "best" set. Usually the method that uses the most data is considered "best" unless a personal preference over-rides this choice. A log analysis expert would easily choose the best method for a given problem, but a novice would not. The pre-coded algorithms in META/LOG are:

SHALE POROSITY SATURATION PERMABILITY LITHOLOGY
         
GR SONIC ARCHIE WYLIE NONE
CLAVIER DENSITY SIMENDOUX TIMUR DENS/2MINERAL
SP NEUTRON DUALWATER COATES PE/2MINERAL
XPLOT SH SAND PHIxSW POROSITY S-D/2MINERAL
RESD DUALWATER EPT   N-D/2MINERAL
MINIMUM COMPLEX TDT   M-N/3MINERAL
  PE/DENS     PE-D/3MINERAL
  PHIMAX      

The actual code for some of these methods is shown in Figure 14.06, to illustrate both simple and complex algorithms. In addition, the frame-like nature of the cell contents is clearly evident.

Figure 14.06: Typical Algorithms

Display and formatting information is embedded in the frame as well as its name (the cell location). The distinction between rules and algorithms is blurry, as demonstrated by the last algorithm shown; in reality it is a rule to choose the "best" porosity algorithm.

The rule base, coded as IF...THEN statements, is kept in another group of records (cells). The rules determine the feasible log analysis method, based on the available data, borehole conditions, and the expected rock and fluid type. Additional rules determine the appropriate parameters for the selected method and expected rock and fluid type. Some typical rules from META/LOG are shown in Figure 14.07. Note that the frame contains two parts - the actual operating code for the rule and English explanation of what the rule does. Note that the macro languages of today are far more readable and self-explanatory than those of 1989.

The algorithms and rules operate on data stored in three areas - the knowledge base containing relatively static facts, the raw data which contains information particular to the current example, and the current parameter/ options array.

Figure 14.07: Typical Rules

This latter data array serves as the blackboard, or global data base, of the system. As rules are fired, they check the current status of the blackboard and update it in accordance with the instructions within the rule. An extract from the blackboard is shown in Figure 14.08.

The distinction between facts or parameters and rules is also blurry. Facts are merely "terminal rules", that is, rules that do not lead to further rules. A distinction is made here because the appearance of a fact in the frame is so much more concise than a normal rule. For example, the rule "If Lithology is Glauconitic Sandstone, THEN matrix density is 2740 Kg/m3" takes a lot more effort and space than the same value placed in an indexed array or lookup table. Both, however, accomplish the same function within this program and facts may be coded in either form.


Figure 14.08: Parameter/Options Blackboard

The instructions in the rule base are executed by a 1-2-3 macro, which operates as the inference engine. When rules are written appropriately to review all the related elements on the blackboard, they are order independent. Some order is imposed voluntarily for simplicity, eliminating the need for the inference engine to search both ways in the rule base, although the Lotus macro language could handle this situation if needed. For example, rules about log availability and quality are executed before rock and fluid rules. There is no point in executing rules which relate to methods for which there is no data.

Rules can be moved or copied with the Lotus/Move or/Copy commands, even though data in the fact base cannot unless you are willing to re-write the affected rules. This is caused by the fact that Lotus does not update cell addresses in macros - a serious flaw in the use of a spreadsheet for a rule based system. Moving rules allows you to insert a new rule that is order sensitive. If you do add or change rules, you may also need to change or add to the Questionnaire.

An audit trail of the reasoning used by each rule that has been invoked by the inference engine is displayed on the screen as each rule is fired. Review of this list allows the user to verify that answers to the questionnaire were correct, and that the methods and parameters chosen by the expert system are reasonable. Only the rules that were fired are explained in the reasoning. An example is shown in Figure 14.09. No further interogation of the reasoning is possible in the current program. However, by changing answers to the questionnaire, differences in the reasoning become apparent and act as an excellent training mechanism.

Finally, a manual over-ride mode to fine tune the parameters and methods suggested by the system, or to bypass the system altogether, is available. By manually editing parameters and option switches, the user can impose his own beliefs independently of the expert system. This step is normal, since log analysis is often an iterative process. Rules for iterative enhancement of results are planned for a future release.

Figure 14.09: Results of Expert System Reasoning

14.06 Advantages of a Spreadsheet for AI Applications
The advantages of a spreadsheet can be stated simply - speed, low cost, ease of use, familiarity, and limited programming skills required. The advantages of an expert system are the ease of use by novices, consistency between jobs and faster results with less chance for errors in the parameters.

All the features of a simple expert system can be created and tested by a log analysis specialist, and used in a production environment by relatively inexperienced log alaysts. No special training for the user is required. In fact, the questionnaire format is a very natural interface between the user and the expert system and could be invoked in any program.

A typical data set of ten depth points and five data curves can be entered into an existing log analysis spreadsheet in about two minutes by even a poor typist. The user must, of course, know how to pick values off a log. The questionnaire takes another two minutes to complete. The calculations take about one tenth to two minutes, depending the CPU clock rate and operating environment. Three or four crossplots can be viewed and results can be printed in less than two minutes. Total elapsed time is less than ten minutes. Each recomputation with a different parameter takes less than a minute. Large data arrays - say fifty depth points - can be entered, computed, and printed in about 20 minutes using digital data from a data base or standalone digitizer program.

It is important to calibrate log to core data results and this is very easily accomplished. By adjusting shale, porosity, saturation, and permeability parameters and recomputing, a reasonable match, or a reason for the mismatch can be found. Cores often do not cover the whole pay interval, so, after the calibration step, one must be sure to revise the depths on the hydrocarbon summary to cover all the pay. If many parameters need adjustment, the elapsed time varies from 20 minutes to 2 hours depending on the severity of the problem.

Progressive engineers and geologists, familiar with microcomputers and spreadsheets, can learn to use such a package in less than an hour of practice. Modifying a spreadsheet or creating new ones for specialized analysis should take only one day's practice, and thereafter a few minutes to an hour may be needed to tune the spreadsheet algorithms, rules, or facts to a particular new problem. People with limited knowledge may need as much as five days log analysis traianing plus three days of computer and spreadsheet training before embarking on real work with the system.

Because the screen layout, printed results, and data structure are one and the same, the spreadsheet contents become familiar quickly. The data sheet is always available for viewing, compared to conventional log analysis packages in which the data structure is invisible to the user.

Other factors, such as built in data management, file storing and retrieving, graphics, simple and friendly keystroke sequences, make spreadsheets more attractive than writing or using Basic or Fortran programs. It is certainly easier to use than most PC based expert system shells, which are usually not designed for mathematically complex problems like log analysis.


14.07 Disadvantages of a Spreadsheet for AI Applications

Clearly, a spreadsheet is not an AI shell in the usual sense. However, both forward and backward chaining are possible, as they are in most programming languages. Data representation is limited to whatever a cell can hold but this is incredibly flexible. Frame based concepts are easily supported, but others are less easy to formulate. This rigidity in format and concept would not appeal to all AI practitioners.

Spreadsheets are very memory hungry and like a lot of disc space. There are workarounds for this, which must be used whenever many large spreadsheets are needed. True symbolic processing and uncertainty handling are not supported and would be difficult to program in the Lotus macro language.

As a test bed for rules that reduce the complexity of log analysis, the spreadsheet is a great prototyping tool. This makes it easier to test concepts in a very inexpensive vehicle and transfer successful concepts to more elaborate models. If an appropriate shell can be found, that includes all the data management and manipulation features of the spreadsheet, it may be a better choice. However, most shells are strong on data representation and weak on manipulating data.

Graphics presentation in spreadsheet programs is adequate for most crossplots, but a true 3 or 4 track depth plot is not possible using Lotus commands. A macro, however, can be written to drive a pen plotter or printer/plotter emulator from within the spreadsheet.

Adequate depth plots can be made with Lotus commands, if a little care is taken to scale and position each track by using the Lotus Printgraph program and a pen plotter. Another option is to write an ASCII file of relults with the Lotus print commands after which the data may be picked up by a commercially available plot program for final display.

Another major complaint is the difficulty of entering large amounts of data to the spreadsheet. Although the primary use of this expert system is for hand picked data, as a replacement for the hand calculator or chart book, some people prefer continuous digitized data. A macro could be written to drop data directly into the spreadsheet cells, but this has not been attempted yet.

A number of commercially available digitizing programs are available for use with META/LOG. These run in BASIC or compiled BASIC, not in Lotus 1-2-3. They create an ASCII file of each log curve, or a combined ASCII file of all curves. Lotus 1-2-3 allows entry of this data in a three step process.

First the ASCII file is loaded into an empty spreadsheet using the Lotus 1-2-3 /File Import command. The ASCII file must have a .PRN suffix on its file name for Lotus to recognize it as a legal file. If the file name suffix is not .PRN, the MS-DOS Rename command is used to change the file name appropriately. Then the lines of data in the file are separated into their columns using the /Data Parse command. Columns of data may need to be interchanged to match the column order of the META/LOG data array. Data can be edited and depth shifted at this stage prior to combining into META/LOG. Finally, this file is inserted into the META/LOG spreadsheet by using the /File Combine command.

Log data on digital tape can be transformed to ASCII files by a number of commercially available programs. Core data is also available on floppy disc from some core service companies. These companies provide utility programs to convert their files to ASCII format. This file is then loaded as described above for digitized log data.

Some service bureaus offer log and core data over the telephone, email, or nretwork connections from their databases. A suitable commercial data communication program, such as KERMIT, can be used to dial up the database and retrieve the desired data. The retrieval will create an ASCII file containing log or core data.

All this seems like a lot of work, but in fact is not much different than conventional log analysis systems. These problems of data entry and display are not unique to spreadsheet log analysis systems, but are more obvious because they occur outside the spreadsheet instead of being integrated into the system. The Lotus spreadsheet also supports all the usual read and write instructions to files or peripherals, so there is no reason that normal plotter and digitizing programs cannot be written within the macro language.

A final note of caution: some spreadsheets do not support forward references, that is a reference to a cell that is below or to the left of the current cell. These spreadsheets are useless for a system such as the one described here and in fact are useless for all but the most trivial financial applications.

14.08 Software and Hardware Requirements (1989)
The following section was prepared in 1989. Any current desktop computer or work station far exceeds these minimum requirements. The section is retained in its original form to give you an idea of how far we have come since 1979.

To run META/LOG you need a computer that can run Lotus 1-2-3 (Ver 2.01 or higher). Usually this is an IBM-PC or an equivalent computer using PC-DOS 2.0 or MS-DOS 2.0 or higher. You probably have one in your office now.

Recommended:

  • IBM-PC/XT or AT or compatible
  • 640 kbytes of memory (RAM)
  • colour or monochrome monitor
  • EGA or VGA graphics
  • 1 or 2 floppy disc drives (5-1/4" or 3-1/2")
  • 20 Mbyte hard disc
  • parallel and serial ports
  • enhanced keyboard with separate cursor and numeric keypads
  • MS-DOS or PC-DOS 2.0 or higher
  • Lotus 1-2-3 2.01 or higher
  • NLQ printer (serial or parallel)

Optional

  • DOS 3.0 or higher
  • LIM expanded memory driver
  • 2 or 4 Mbytes expanded memory
  • laser printer with Lotus driver
  • pen plotter with Lotus driver
  • 40 or 60 Mbyte hard disc
  • numeric co-processor
  • 286 or 386 CPU, high clock rate
  • memory resident print buffer
  • word processor package
  • data communication package
  • applications manager package

You should have a hard disc, because the spreadsheet files are fairly large and floppy discs are awkward. You need at least one 5-1/4 inch or 3-1/2 inch floppy drive to read the original program discs. A good dot matrix or ink jet printer is essential.

If you want to compute more than about 50 depth points in the log analysis program, you will need the extended memory capability provided by MS-DOS 3.0 along with Lotus-Intel-Microsoft compatible extended memory management software and up to 4 megabytes of extra memory.

A typical system will cost between $2000.00 and $7000.00 depending on features and performance. Higher priced computers will load and process data faster but not better. The added friendliness of speed makes it easier to get through the day but won't change your answers.

14.09 Examples of Spreadsheet Analysis
The first example, shown in Figure 14.10, is from a well in which the Halfway sand was analyzed. The Halfway is characterized by anhydrite and dolomite in a shaly sand environment. Analysis requires a complex lithology model. The expert system performs an adequate first pass analysis. Because core data was entered, log analysis permeability is calibrated by a best fit line to core porosity.


Figure 14.10: Sample Raw Data and Results


Figure 14.11: Sample Summary Table and Automatic Report

On the summary page, Figure 14.11, the log and core data match quite well. Moreover, estimated initial productivity compares favourably to the well's un-stimulated initial deliverability. Note that the text report is a useable final product, ready for the well file or boardroom completion/abandonment meeting without the errors and time lag of a typist.

In the absence of core data, the expert system would have underestimated permeability, and hence initial productivity, by 50%. The analyst would have to recognize this problem by using local knowledge and either manually re-compute with a different permeability parameter (CPERM) or change the suggested values of CPERM in the knowledge base. It should be noted that the productivity estimates embedded in META/LOG are not suitable for fractured, stimulated or dual porosity reservoirs. All algorithms used in META/LOG are described in detail in Ref 5.

Some typical crossplots, using Lotus graphics, are displayed in Figure 14.12. Plots are graphic dumps of the screen contents to the printer, using Lotus PrintGraph and an HP LaserJet printer. More attractive but slower plots can be prepared using the utility programs supplied by Lotus 1-2-3 to drive a pen style colour plotter or colour printer.


Figure 14.12 : Typical Crossplots

Cash flow, based on a current price and costs estimate, is shown in Figure 14.13. There is no doubt that this well is economically viable and that more similar wells should be drilled if possible. One advantage of tying economics to the log analysis is that it gives a much better answer to the question "Is the well any good?" than does porosity, saturation or net pay.


Figure 14.13: Typical Cash Flow Analysis

The second example is a radioactive sand (Keg River/Granite Wash)/ Results are shown in Figure 14.14. The analysis model used the Uma/DENSma crossplot for lithology and calculated porosity from the density log with the mineral mixture determining the matrix density at each point. The depth plot is the best that can be achieved with Lotus and is suitable for quicklook applications. By telling the system that radioactive sands were present, the gamma ray was not used for shale volume but was derived from density neutron separation.


Figure 14.14: Spreadsheet Depth Plot in Radioactive Sandstone

The depth plots shown above may be considered as pretty crude by log analysis standards, and they were never intended to replace conventional depth plots such as the one shown in Figure 14.15 for the Halfway sand example shown earlier. This plot was created from a standalone log analysis plotting program, LAS/PLOT, which reads LAS files created by META/LOG spreadsheets.


Figure 14.15: Depth Plot of a Spreadsheet Analysis

14.10 In Conclusion
Spreadsheet analysis of logs with an expert system to reduce complexity is a viable approach to reducing the burden of quantitative log analysis. It provides sophisticated analysis at low cost, is friendly and easy to use and can be custom tailored to suit the needs of individual analysts or problems.

Limitations on spreadsheet memory usage and limited graphics, especially for depth plots suggest that this form of computer aided analysis will not replace stand alone special purpose programs. But spreadsheets are certainly much better than all programmable calculator methods at only a modest increase in cost.

14.11 Addendum
"Spreadsheet Analysis of Logs" was presented at the 10th Formation Evaluation Symposium, Canadian Well Logging Society, September, 1985.It is reproduced here to show the progress in spreadsheet software since 1985.

Introduction

The tools of the log analysis trade have evolved rapidly over the last 20 years, from charts and nomographs, to slide rules, to programmable calculators, to desktop micro computers. Each method has its drawbacks. Charts are not very accurate or repeatable. Slide rules are difficult for some people to use and don't add or subtract very well. Let's face it, these methods are as obsolete as the ES log.

Programmable calculators are easy to use but not too easy to program. Desktop computers (or terminals to larger computers) may or may not be easy to use, but are almost invariably programmed by others, so custom analysis is difficult.

Another solution has been available for about seven years, but only recently have computers and software been powerful enough, large enough, fast enough and cheap enough to do log analysis. This solution is the electronic spreadsheet on modern microcomputers, typified by products like Visicalc, Supercalc, Context MBA, and Lotus 1-2-3.

Early versions of Visicalc could do log analysis, but memory and speed limitations made it impractical. Today, however, these limitations have been removed with the advent of professional desktop computers like the IBM-PC/XT, the HP 150 and HP 200 series, and many comparable machines.

What Is An Electronic Spreadsheet?

An electronic spreadsheet is a computerized replacement for the pencil and columnar pad so familiar to log analysts who do hand calculations at the well site or in the office. One virtue of the electronic spreadsheet is that it can be much larger than a sheet of paper, often allowing up to 256 columns of data (with up to 72 characters per column) by 2000 rows long. The screen or monitor of the computer is a window on this large array of data.

Each row is assigned a row number and each column is headed by a column letter. The intersection of a row and column is termed a cell. Cells are identified by their row and column designation (e.g. C24).

Movement around the portion of spreadsheet seen on the screen is performed by moving a cursor using the keyboard, lightpen or mouse attached to the computer.

Cells can contain four kinds of data:

1. text or labels 2. numbers (raw data or answers) 3. formulae or algorithms 4. spreadsheet functions

Mathematical formulae in spreadsheets show relationships between cells (e.g. C1=+A1+B1). If the relationship of many cells is similar (e.g. C1=+A1+B1, C2=+A2+B2, etc.) then each individual relationship must be described. There are semi-automatic methods for doing this in most spreadsheets. There are no general purpose array operations such as add column A to column B to get column C.

Spreadsheet functions are shortcuts which help the user perform common mathematical computations, such as SUM, AVERAGE, MINIMUM, MAXIMUM, IF...THEN, and PRESENT VALUE. Some of these are not even available in conventional programming languages but must be coded uniquely each time they are needed.

The act of building a log analysis model with a spreadsheet defines all the mathematical relationships in the model, as well as the raw data and analysis parameters. You can edit formulae, insert rows or columns, fix typographical errors, enter new data, and re-compute results without changing a Basic or Fortran program. The spreadsheet is the program!

You can also play "what if?". You could vary the water resistivity, matrix or fluid parameters, or shale values and see the change in the log analysis results in 10 to 30 seconds. Computations can be revised immediately after entering a change, or after entering all changes, as decided by the user.

The original spreadsheet, after testing and debugging, can be used immediately, or saved as a standard analysis package. Standard spreadsheets can be customized to fit individual problems when they are used. Managers or professionals may create spreadsheets for their own use, or for use by other professional or technical staff in other departments.

Spreadsheet software is the most widely used application for microcomputers. Over 500,000 copies of Visicalc have been sold, and there are more than 60 competing products, many with implementations on five or more microcomputers. Table 1 and 2 indicate some of these products and their suppliers.

What To Consider In A Spreadsheet Package

A warning note should be issued. Spreadsheet software is no panacea. It is handy for small jobs needing quick turn around, or those with a lot of "what if?" situations - the same kind of jobs you would do with charts or a calculator. The spreadsheet will not replace dedicated software on microcomputers or mainframes. One of the reasons for this is the difficulty in plotting results in conventional log analysis format (3 track or 4 track presentations).

Many spreadsheet packages do have integrated graphics, and are capable of illustrating most common crossplots (on linear axes) and can do a limited form of depth plot. A few packages have integrated word processing, so final reports can be constructed after the analysis has been completed.

Not all spreadsheets will work on all computers, nor are all computers large enough or fast enough to be useful. The minimum configuration is usually a computer with 256,000 bytes of memory, a dual disc drive with one to five megabyte storage, a CRT with at least 19 lines and 80 characters, an 80 character dot matrix printer, and a plotter (optional). All these components must be supported by the computer's operating system, as well as the spreadsheet software. A digitizer could be added, but custom software, probably in machine language or assembler language, will be required.

The minimum cost for a practical system would be in the order of: computer C$ 6,000, dual disc drive 2,000, printer 700l plotter 2,300 for a total of $11,000

The spreadsheet software is often bundled with the hardware, but would cost less than $700 in any case.

Each spreadsheet package utilizes its own syntax (just as the variants of Basic or Fortran do), and interchange of models from one system to another may require some editing. The examples in this Chapter use the language of Lotus 1-2-3, one of the more sophisticated packages. However, the common commands are similar to Visicalc and conversion should be relatively easy.

One important consideration when choosing a spreadsheet package is whether or not the package supports forward referencing. The older packages, or cheaper packages, do not permit calculations of a cell if it refers to a cell not yet calculated. These programs start at the top left corner and work either horizontally or vertically.

Those that permit forward referencing search the spreadsheet for the key cell and work out from there. This allows the spreadsheet layout to be more flexible and more attractive to the user. Little used columns could be way off screen to allow more valuable columns, such as answers, to be closer to the window. This form of calculation is termed natural computation.

Another necessary feature involves security. There are two forms and both are desirable - hidden cells and protected cells. Hidden cells have a zero width, so formula or data in them cannot be seen. They can still be edited or even made visible, but it is unlikely that an accidental entry to such a cell will be made. Protected cells may be visible or hidden, but cannot be entered by the user to accidentally or erroneously change a key formula, or even lose it permanently. The originator of the spreadsheet can obviously unprotect a cell and modify the contents.

Another area of interest is the ability to use "macros." Macros are miniature programs which can be invoked by a single key stroke. They are stored with the spreadsheet and loaded when the spreadsheet is loaded. They could be used, for example, to print only the desired columns of a larger spreadsheet.

Advantages Of Spreadsheet Analysis

The advantages of spreadsheet analysis can be stated simply - speed, cost, ease of use, familiarity, and limited programming skills required.

A typical data set of ten depth points and five data curves can be entered in about two minutes by even a poor typist. The calculations take about ten seconds. Three or four crossplots and results can be printed in less than two minutes. Total elapsed time is five minutes. A large data array - say fifty depth points - can be entered, computed and printed in about 20 minutes.

Progressive engineers and geologists familiar with microcomputers and spreadsheets, can learn to use such a package in less than an hour of practice. Modifying a spreadsheet or creating new ones for specialized analysis should take only one day's practice, and thereafter a few minutes to an hour may be needed to tune the spreadsheet to a particular problem.

The spreadsheet example illustrated later took about 60 hours to write, test and debug, but modifications to this basic structure take only minutes. The user must be familiar with the spreadsheet commands, however.

Because the screen layout, printed results and data structure are one and the same, the spreadsheet contents become familiar quickly. The data sheet is always available for viewing, compared to log analysis packages in which the data structure is invisible to the user.

Other factors, such as built in data management, file storing and retrieving, graphics, simple and friendly keystroke sequences, make spreadsheets more attractive than writing Basic or Fortran programs.

The software is usually available free of charge; that is, it is part of an existing micro computer system already in the work place. Even if it is not, it is a low cost item when compared to stand alone packages priced from $10,000 to $100,000.

Limitations of Spreadsheet Software (1985)

The concept of spreadsheet software was developed for business and financial calculations. As a result, scientific use is not perfect. The following lists some of the problems encountered in developing the present program.

A cell can contain only one statement or formula. Therefore, these statements can be extremely complicated and unreadable if they are to be powerful enough to do the task of a multi-line subroutine. For example, to select the desired porosity option from the several which may be computed requires a statement over 100 characters long and with 13 closing brackets at the end. Similarly, the statement to find the minimum shale volume (and not crash when no shale data is available) is 205 characters long. Lotus 1-2-3 allows statements to be 240 characters long, but previous programming experience helps, as does the ability to build complicated statements piece by piece.

Cells can be referenced by absolute locations or by relative addresses. Interpretation parameters are logically stored in absolute addresses, so you always know where they are. However, relative locations for log data curves can create unforeseen problems. For example, it is tempting to use the editing feature of spreadsheets to move a column of data up or down (depth shifting). However, any references to this data in equations will also be shifted, so that data from two different depths end up being used for a single depth. This can be solved by more absolute addresses in the equations, creating a new problem. Adding lines of data to allow for a larger zone becomes impossible unless one uses named data ranges - but it is starting to get unwieldy and complicated at this stage.

Other concepts of editing also fail. For example, re-scaling a curve (for sonde error or a units conversion) is simple. Just copy a formula into each cell and the new value is automatically computed. But if you leave the formula in the cell, it will re-compute each time the job is run - not what you wanted. You might consider putting the rescaling equation into a different cell and feed the answer back to the original cell, then delete the rescaling cell. This doesn't work either, since the original cell still contains references to the deleted cell - it never contains the answer you want unless you type the number in yourself.

Arithmetic errors propagate rather badly. If a divide by zero (or similar) error occurs, the term ERR will show in the answer, as well as in every answer that depended on the original error. This is shown by the term ERR appearing in each formula that was affected by the error ~ thus losing the contents of totally innocent formulae. This might happen to 400 or 500 cells due to some very trivial data error. The situation will rectify itself, and all the ERRs will disappear if you fix the originating error. However, since so many equations show ERR, it is not always possible to find the offending equation on the first pass. If you don't, all is lost, and a lot of retyping will ensue.

The graphics programs are general purpose business oriented, and two dimensional, They create reasonable 2-D crossplots, but can not handle Z-plots or 4-D plots. Depth plots can not be made in log analysis format or to a specific depth scale, Even if asked to make a +0.45 to -0.15 scale for density porosity, the program insists on making a -0.20 to +0.50 scale. Depth plots were labeled in scientific notation (2.050E00 for 2050 meters) even when a fixed format was requested. These esthetic problems could not be overcome, and you can't phone your friendly local software supplier for a quick fix over the phone.

Another serious limitation is the inability to use text strings in a formula, or to generate text strings from an equation , or even from a lookup table, It would be nice, for example, to output lithology codes as words or to use the log type or log units in words to switch logic in the program, as Basic and Fortran allow. There is no way to see text on the screen unless you type it there, or use a macro to type it for you.

Spreadsheet size is also a problem. The examples presented here have the usual shale, porosity, saturation, permeability, and lithology models, as well as six crossplots and a hydrocarbon summary. For ten lines of data, the spreadsheet takes up about 70,000 bytes. A set of fifty data points takes a bit more than 260,000 bytes. Conventional Basic or Fortran data files would take much less space. Program "code" is stored with each data file, instead of only once for conventional programs.

The lack of a true DO loop is frustrating and accentuates the space problem. In order to obtain results for three sets of cutoffs, for example, you could run the job three times and print the answers three times. Or all three sets could be coded into the spreadsheet. This takes considerable space, since each cutoff set requires five columns of equations , each the full length of the data set. A DO loop would eliminate the need to code each iteration through the loop. Lotus 1-2-3 does have an iteration mode, but only one variable can be used as the iteration test - not enough to do cutoffs on four or five variables. Indirect addressing of cells would help reduce this problem also, but there doesn't seem to be any way to use the result in one cell to address another cell.

Examples of Spreadsheet Analysis

The examples shown in the illustrations are from one well in which the Halfway sand was analyzed (same data as Appendix Two). The Halfway is characterized by anhydrite and dolomite in a shaly sand environment. Analysis requires a complex lithology model. The example was treated first as a hand entry job with eleven lines of data of unequal thickness. A second presentation was made using equally spaced data from a digitizer. Interpretation parameters, raw data, results, net pay, crossplots, and depth plots were printed from the spreadsheet. Little difference in hydrocarbon volume is evident between the two approaches. See illustrations in Section 14.09.

The analysis model uses slightly simplified versions of the algorithms used in this handbook. Plots are graphic dumps of the screen contents to the printer. More attractive, but slower, plots can be prepared using the utility programs supplied by Lotus 1-2-3 to drive a pen style colour plotter or colour printer.

In fact, this spreadsheet program has most of the features of LOG/MATE except for the length of data files, processing speed (five times slower), quality depth plots, 3 and 4-D crossplots, and many of the unique data editing features. Not bad for a $700 program.

The spreadsheet program is available by contacting the author.

Conclusion

Spreadsheet analysis of logs is a viable approach to reducing the burden of quantitative log analysis. It provides sophisticated analysis at low cost, is friendly and easy to use, and can be custom tailored to suit the needs of individual analysts or problems. Sufficient limitations on existing spreadsheet capabilities exist to suggest that this form of computer aided analysis will not replace stand alone special purpose programs. But spreadsheets are certainly better than nearly all programmable calculator methods at only a modest increase in cost.

Micro computers and spreadsheet software are becoming ubiquitous in engineering offices, so it is likely that more and more log analysis will be done by this method. The examples presented here should provide a good starting point for any one interested in pursuing this line of analysis.

TABLE 1: THE TEN MOST POPULAR SPREADSHEET PROGRAMS - 1985

Name Integrated Packages / Supplier


VisiCalc VisiFile Visicorp VisiCalc Advanced VisiTerm VisiTrend VisiWord VisiPlot San Jose CA 95134 408-946-9000


SuperCalc SuperChart SuperCalc 2 SuperWriter Sorcim Corp San Jose CA 95131 408-942-1727


CalcStar WordStar SpellStar MicroPro Corp San Raphael CA 94903 MailMerge 415-499-1200


Multiplan Nil Microsoft Corp Bellevue WA 98004 206-828-8080


PerfectCalc PerfectFiler PerfectSpeller Perfect Software Berkeley CA 94710 PerfectWriter 415-527-2628


ProCalc ProGraph ProOp Software Products San Diego CA 92121 LogiQuest 714-450-1526


Context MBA word processor data comm Context Mgmt Syst Torrance CA 90505 graphics 213-378-8277 data base


Lotus 1-2-3 graphics data base Lotus Corp Cambridge MA 02138 617-492-7171

TABLE 02 SOME OTHER SPREADSHEET SUPPLIERS

Name Supplier
BudgetPlan Dangen Corp
   
Cope-PC Plenary Systems Inc.
   
Desktop Plan Visicorp
   
Easy Calc Norell Data Systems
   
Electronic Spreadsheet American Planning Corp.
   
Ferox Modeler Ferox Microsystems
   
Finar Finar Research
   
Graph'n 'Calc Desktop Computer Software
   
LogiCalc Software Products International
   
Master Planner Comshare Target Software
   
Microplan Chang Laboratories Inc.
   
MiniModel Westico Inc.
   
Plan 80 Business Planning Systems Inc.
   
Scratch Pad Super Soft
   
The Thinker Texas Soft
   
Knowledgeman Micro Data Base Systems
   
Vision Calc Visicorp
   
Graph Plan Chang Laboratories Inc.
   
The Planner Hayden Book Corp.
   
Peach Colc Peachtree Software
   
Super Comp-Twenty Access Technology Inc.
   
Lisa Calc Apple Computer
   
Magi Calc Artsci Inc.
   
Senior Analyst Business Solutions Inc.
   
Basic Plan Cado Systems
   
Ultra Calc CIE Systems
   
Calc Result Computer Marketing Service
   
Planner Calc Comshare Target Software
   
Target Planner Comshare Target Software
   
Superscreen Creative Software Concept
   
Plan Master Cromemco Inc.
   
Business Planner Duosoft Corp.
   
Execu/Model Executic Software
   
Master Calc Four-M Marketing