Thursday, September 16, 2010

What Is Data Processing, What Is Data Representation?

When people communicate by writing in any language, the symbols used (the letters of the alphabet, numerals, and punctuation marks) convey information. The symbols themselves are not information, but representations of information. Data in an electronic data processing system (EDPS) must be expressed symbolically so that the machines can interpret the information presented by humans. In general, the symbols that are read and interpreted by a machine differ from those used by people. The designer of a computer system determines the nature and meaning of the particular set of symbols that the system can read and interpret. The actual data used by these systems is (or was in the past) presented as holes in punched cards or paper tape; as spots on magnetic tape; as bits (binary digits) or bytes of information on a disk, diskette, CD-ROM, or optical disk; as magnetic-ink characters; as pixels in display-screen images; as points in plotted graphs; or as communication-network signals.
In many instances, communication occurs between machines. This communication can be a direct exchange of data in electronic form over cables, wires, radio waves, infrared, satellites or even wireless devices such as cellular phones, pagers, and hand-held personal organizers and/or notebooks. It can also be an exchange where the recorded or stored output of one device or system becomes the input of another machine or system.
In the computer, data is recorded electronically. The presence or absence of a signal in specific circuitry represents data in the computer the same way that the presence or absence of a punched hole represented data in a punched card. If we think of an ordinary lightbulb being either on or off, we could define its operation as a binary mode. That means that at any given time the lightbulb can be in only one of two possible conditions. This is known as a "binary state." In a computer, transistors are conducting or nonconducting; magnetic materials are magnetized in one direction or in the opposite direction; a switch or relay is either on or off; a specific voltage is either present or absent. These are all binary states. Data is represented within the computer by assigning a specific value to each binary indication or group of binary indications. Binary signals can be used to represent both instructions and data; consequently the basic language of the computer is based primarily on the "binary number system."
A binary method of notation is usually used to illustrate binary indications. This method uses only two symbols: 0 and 1, where 0 and 1 represent the absence and presence of an assigned value, respectively. These symbols, or binary digits, are called "bits." A group of eight bits is known as a "byte," and on many machines a group of 32 bits (4 bytes) is known as a "word," although the word size varies from one computer design to another. The bit positions within a byte or a word have place values related to the binary number system. In the binary number system the values of these symbols are determined by their positions in a multidigit numeral. The position values are based on the right-to-left progression of powers having a base of 2 (2⁰, 2¹, 2², 2³, that is, 1, 2, 4, 8), commonly employed within digital computers. For example, if there are four light bulbs next to each other numbered 4, 3, 2, and 1, and bulbs 1 and 3 are "on" while bulbs 2 and 4 are "off," the binary notation is 0101, which has the value 4 + 1 = 5.
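The place-value rule and the light bulb example above can be checked with a few lines of Python. This is only an illustrative sketch; the function names `bits_to_value` and `bulbs_to_bits` are made up for this example and do not come from any library.

```python
def bits_to_value(bits):
    """Sum the place values (powers of 2) of every position holding a 1."""
    total = 0
    # Walk right to left so that position 0 is the rightmost bit (2**0).
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        if bit == "1":
            total += 2 ** position
    return total


def bulbs_to_bits(bulbs_on, count):
    """Write bulbs numbered count..1 (left to right) as a string of 0s and 1s."""
    return "".join("1" if n in bulbs_on else "0" for n in range(count, 0, -1))


# Four bulbs numbered 4, 3, 2, 1 with bulbs 1 and 3 switched on.
pattern = bulbs_to_bits({1, 3}, count=4)
print(pattern, bits_to_value(pattern))  # 0101 5
```

Bulb 3 sits in the position worth 2² = 4 and bulb 1 in the position worth 2⁰ = 1, so the pattern 0101 stands for 5.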

The system of expressing decimal digits as an equivalent binary value is known as Binary Coded Decimal (BCD). In the six-bit character form of this code, up to 64 characters, including alphabetic, numeric, and special signs, are represented using six positions of binary notation (plus a parity bit position). The Extended Binary Coded Decimal Interchange Code (EBCDIC) uses eight binary positions for each character plus a position for parity checking, so 256 characters can be coded. The American Standard Code for Information Interchange (ASCII) is a seven-bit code that offers 128 possible characters. ASCII was developed by users of communications and data processing equipment as an attempt to standardize machine-to-machine and system-to-system communication.
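As a rough illustration of these codes, the Python sketch below prints the seven-bit ASCII pattern of a character and computes a parity bit for it. The helper names are invented for this example, and even parity is an assumption (the text does not say whether odd or even parity is used). The last lines use Python's built-in `cp037` codec, which is one of several EBCDIC code pages, to show that the same letter gets a different eight-bit pattern in EBCDIC.

```python
def ascii_bits(char):
    """Return the seven-bit ASCII pattern of a single character."""
    code = ord(char)
    if code > 127:
        raise ValueError(f"{char!r} is outside the 128-character ASCII set")
    return format(code, "07b")


def even_parity_bit(bits):
    """Return the extra bit that makes the total number of 1s even (assumed scheme)."""
    return "1" if bits.count("1") % 2 else "0"


for ch in ("A", "a", "7"):
    pattern = ascii_bits(ch)
    print(ch, pattern, "parity:", even_parity_bit(pattern))

# Number of distinct characters available with 6, 7, and 8 bit positions.
print(2 ** 6, 2 ** 7, 2 ** 8)  # 64 128 256

# The same letter in one EBCDIC code page (cp037) uses a different 8-bit pattern.
ebcdic_byte = "A".encode("cp037")
print(format(ebcdic_byte[0], "08b"))  # 11000001
```

The counts 64, 128, and 256 follow directly from the number of bit positions: n positions give 2ⁿ distinct patterns.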
Computer Number Systems and Conversions. Representing a decimal number in binary notation may require very long strings of ones and zeros. The hexadecimal system is used as a shorthand method to represent them. The base of this system is 16, and the symbols used are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Each hexadecimal digit stands for exactly four bits; for example, F is 15 in decimal notation and 1111 in binary.
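The shorthand works because a binary string can be split into groups of four bits, each of which maps to one hexadecimal symbol. A minimal Python sketch of that grouping follows; the function name `to_hex_by_nibbles` is hypothetical and chosen only for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def to_hex_by_nibbles(bits):
    """Convert a binary string to hexadecimal, four bits at a time."""
    # Pad on the left with zeros so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each four-bit group has a value from 0 to 15, which indexes one symbol.
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(to_hex_by_nibbles("1111"))      # F
print(to_hex_by_nibbles("11111111"))  # FF
print(to_hex_by_nibbles("101101"))    # 2D

# Python's built-in conversions agree: 45 decimal is 101101 binary and 2D hex.
print(int("101101", 2), format(45, "X"))  # 45 2D
```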
Programming Language Techniques. Assembler languages are closer to machine instructions than to human language. Expressing logical procedures, arithmetical calculations, and textual manipulations in these languages is cumbersome, which reduces a programmer's productivity. There are many higher-level programming languages, such as ALGOL, BASIC, COBOL, FORTRAN, and Pascal, that are much closer to human means of expression.
A programmer writes a source program in a human-readable programming language. A compiler translates these English-like statements into instructions that the computer can execute; such instructions are called an "object program." The object program is then combined with added library routines and executed, and an "output" is produced. There are some "optimizing compilers" that automatically correct obvious inefficiencies in source programming. An "interpreter," by contrast, executes the user program piece by piece, which makes it possible to debug a program as it runs. MUMPS, LISP, and APL are languages that are typically interpreted; they are used in the health care environment, artificial intelligence, and mathematics fields, respectively.
Because of the time and costs associated with development, it is generally more cost effective in today's environment to buy an application package (if available) from a vendor than to develop one. The costs are thus spread among thousands of users. Typical application packages used for public health purposes are SAS and SPSS (for biostatistics) and ArcView/GIS (for Geographical Information Systems).
In addition, there are database products (e.g., Oracle and dBASE) that provide languages written for manipulating data. A data manipulation language (DML) is a special sublanguage used for handling data storage and retrieval in a database system. Using a data definition language (DDL), programmers can organize and structure data on secondary storage devices.
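To make the difference between a DDL and a DML concrete, here is a small sketch using SQL through Python's standard `sqlite3` module. SQLite is used only because it ships with Python; it is not one of the products named above, and the table and column names are invented for the example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# DDL: define how the data is organized and structured.
conn.execute(
    "CREATE TABLE patient ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " heart_rate INTEGER)"
)

# DML: store, change, and retrieve the data itself.
conn.executemany(
    "INSERT INTO patient (name, heart_rate) VALUES (?, ?)",
    [("Ada", 72), ("Grace", 88)],
)
conn.execute("UPDATE patient SET heart_rate = ? WHERE name = ?", (75, "Ada"))
rows = conn.execute(
    "SELECT name, heart_rate FROM patient ORDER BY name"
).fetchall()
conn.close()

print(rows)  # [('Ada', 75), ('Grace', 88)]
```

The `CREATE TABLE` statement is the data definition step; the `INSERT`, `UPDATE`, and `SELECT` statements are data manipulation.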
Data Acquisition. Capturing and entering data into a computer is expensive. Direct acquisition of data avoids the need for people to read and measure values and then encode or enter the data. Automated data acquisition can help eliminate errors and speed up the procedure. Sensors connected to a patient convert biological signals into electrical signals that are transmitted to a computer. Many of these signals (e.g., ECG, blood pressure, heart rate) are analog signals, and they must be converted before they can be stored in digital form. This process is called analog-to-digital conversion (ADC).
