13 Web cartography

If we revisit one of the definitions of cartography, it will be easier to define Web cartography: “Cartography is a scientific discipline which studies methods of producing and ways of using maps.” What separates Web cartography from traditional cartography is its “limitation” to the Web as a medium. Most of the previously obtained cartographic knowledge remained valid after the appearance of the Web.

Web cartography is a branch of cartography that studies techniques for designing and implementing maps on the Web, as well as the ways in which Web maps are used. In a narrow sense, Web cartography studies technologies for the design, implementation, and dissemination of maps on the Web.

A map represents an abstraction and selection of geographic reality with clearly highlighted spatial shapes and relations. However, a Web map can have new functions:

  • Search engine,

  • Search capabilities related to the local infrastructure of spatial data,

  • Interface for accessing other geographic and non-geographic data,

  • Multimedia,

  • Use in collaborative mapping (e.g., crowdsourcing),…

Examples of Web map applications:

Before moving on to the typical architecture of a Web map portal, a basic overview of key terms related to Internet technologies will be provided.

13.1 Introduction to Internet technologies

The Internet is the largest and most significant network of today. It connects a large number of different networks and computers around the world. There is no single definition of the Internet, but two groups of descriptions can be found in the literature: structural and functional.

From a structural viewpoint, the Internet is defined via the hardware, communication, and software components that comprise it. From this viewpoint, the Internet is a Wide Area Network (WAN), i.e., a network connecting a large number of smaller private or public networks. The Internet enables computers and other devices connected to these networks to communicate with each other. Communication channels are made up of very different physical communication technologies (different types of cables, wireless and satellite connections, etc.). End-point computers are also called host computers. Typically, host computers are connected only indirectly, via devices called routers. The structure of the Internet is hierarchical: host computers are connected to the network of their local Internet Service Provider (ISP), local ISP networks are connected into regional networks, regional networks are connected into national and international networks, etc. Both host computers and routers follow the Internet Protocol (IP) for communication which, among other things, assigns each of them a unique logical address called the IP address. IP defines how information packets are sent between hosts and routers. Packets travel from host to host via an automatically determined sequence of routers, and hosts have no control over the path a packet takes. Software installed on host computers offers users different services on the network.
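
The role of the IP address can be illustrated with a short sketch in Python. It is only an illustration under stated assumptions: the addresses come from ranges reserved for documentation, and the helper is_local() is a hypothetical name introduced here. It shows the simplified rule that a packet addressed outside the sender's own network has to be handed to a router.

```python
import ipaddress

# Hypothetical host and network, taken from address ranges reserved for documentation.
host = ipaddress.ip_address("192.0.2.10")
local_network = ipaddress.ip_network("192.0.2.0/24")


def is_local(destination: str) -> bool:
    """Return True if the destination lies in the sender's own network.

    Simplified view: anything outside the local network must travel via a router.
    """
    return ipaddress.ip_address(destination) in local_network


print(host.version)              # IP version of the address (4 or 6)
print(is_local("192.0.2.77"))    # same network as the host
print(is_local("198.51.100.5"))  # different network, reached via routers
```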

From a functional viewpoint, the Internet is defined via the services it offers to its users. From this viewpoint, the Internet is a network infrastructure which enables the operation of distributed applications. These applications include the Web, which enables users to view hypertext documents, e-mail, data transfer between computers (FTP, SCP), remote access to other computers (Telnet, SSH), etc. Over time, an increasing number of applications have appeared. These applications communicate with each other via their specific application protocols (HTTP, SMTP, POP3, etc.).

The Internet offers a wide range of services, the most famous of which is the Web. This service originated in the 1990s and is currently the most important Internet service. It represents a system of interconnected documents known as Web pages, which can contain text, images, videos, and other multimedia material. Web pages are connected to each other by links, known as hyperlinks. Users activate a link (simply by clicking it with the computer mouse) to move from one page to another.

Pages are stored on specialized Web servers and, at the request of clients, are transferred to the clients' computers, where specialized programs display them. These programs are called Web browsers. Today, the most popular Web browsers are Google Chrome, Mozilla Firefox, Microsoft Internet Explorer, Safari, and Opera.

The basic reason for the success of the Internet is the definition of and adherence to a standard communication protocol. Protocols define the way in which computers, i.e., applications on those computers, can communicate regardless of hardware platforms and operating systems on different computers in the network.

13.1.1 HTTP

Hypertext Transfer Protocol (HTTP) is the foundation of the Web. HTTP is implemented in two types of programs: client programs, typically Web browsers, and server programs, typically Web servers. These programs communicate with each other by exchanging HTTP messages. HTTP defines the structure of these messages and the way in which clients and servers exchange them. The Web is essentially a distributed application based on Web pages. Web pages are composed of objects: hypertext documents described in HTML, images in different formats (e.g., JPG, PNG, GIF), Java applets, etc. Each individual object has its own Web address in the form of a Uniform Resource Locator (URL).
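
As an illustration of what such an address contains, the following Python sketch splits a URL into its parts using the standard urllib.parse module. The address itself is a made-up example on the reserved example.org domain, not one taken from this text.

```python
from urllib.parse import urlsplit

# Hypothetical address of an image object on a Web server.
url = "http://www.example.org:80/maps/tile.png?zoom=5"
parts = urlsplit(url)

print(parts.scheme)    # protocol used to fetch the object
print(parts.hostname)  # host computer running the Web server
print(parts.port)      # TCP port on that host
print(parts.path)      # path to the object on the server
print(parts.query)     # additional data sent with the request
```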

HTTP functions by clients establishing a TCP connection (typically on port 80) with a server and then sending HTTP requests for certain Web objects to the server.

HTTP requests and responses follow a strictly specified format.

An HTTP request is sent after the TCP connection with a host computer has been established. In the first line, the method, path to the requested object on the server, and the HTTP version are provided.

There are several methods, with GET and POST being the most popular, followed by HEAD.

HTTP requests can contain a large number of header fields and values through which the client passes relevant information to the server. After receiving the HTTP request, the server sends an HTTP response.
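
The message format described above can be sketched in a few lines of Python. This is a minimal illustration rather than a complete HTTP implementation: build_request() and parse_status_line() are hypothetical helper names, the host and path are made up, and nothing is actually sent over the network.

```python
def build_request(method, host, path, headers=None):
    """Compose the text of an HTTP/1.1 request message."""
    # First line: method, path to the requested object, HTTP version.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    # Further lines: header fields in the form "Name: value".
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # An empty line marks the end of the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(response_text):
    """Split the first line of an HTTP response into version, status code, and reason."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_request("GET", "www.example.org", "/index.html", {"Accept": "text/html"})
print(request)

# A hand-written response, as a server might return it.
response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(response))
```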

The exchange of information between the Web client and the Web server continues in accordance with HTTP, with the exchanged content typically structured using HyperText Markup Language (HTML) and eXtensible Markup Language (XML).

The GET method sends the request and its data as part of the URL of the resource available on the server. Typically, the response contains the data served by that resource. The POST method does not send data through the URL but in the body of the request (in client scripts, as the argument of the send() method). Since the data does not appear in the URL, POST is preferable when confidential data is sent (usually entered by the user), although the data is encrypted only if the connection itself is secured. The POST method is often used when data is submitted to a server resource that accepts it. The HEAD method requests only the headers for a specified URL, without the document content (e.g., to check the date on which a resource was last updated).
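
The difference between the three methods can be illustrated with Python's standard urllib library. The sketch below only constructs the request objects and does not contact any server; the addresses and parameter names are assumptions made for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical search parameters entered by a user.
params = {"q": "Belgrade", "zoom": "10"}
query = urlencode(params)

# GET: the data travels as part of the URL.
get_request = Request("http://www.example.org/search?" + query, method="GET")

# POST: the URL stays clean and the data travels in the request body.
post_request = Request(
    "http://www.example.org/search",
    data=query.encode("ascii"),
    method="POST",
)

# HEAD: only the headers of the resource are requested, not its content.
head_request = Request("http://www.example.org/map.png", method="HEAD")

for req in (get_request, post_request, head_request):
    print(req.get_method(), req.full_url, req.data)
```

Passing any of these objects to urllib.request.urlopen() would actually send the request; that step is left out here because the addresses are fictitious.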

13.1.2 HTML

HyperText Markup Language (HTML) is a language for creating hypertext documents (structures comprising interconnected units of information displayed on an electronic device) and represents one of the bases of the Web.

HTML is a descriptive language used to define the content of Web pages. It is developed and maintained by the World Wide Web Consortium (W3C). At the moment of writing, the current version is HTML 5.0. Using HTML, elements of a Web page can be easily separated, such as titles, paragraphs, references, images, and tables. Hence, HTML is a strictly descriptive language although certain dynamic aspects of Web pages can be obtained using predefined tags in the current HTML 5.0 version (multimedia content, scalable graphics). The advantage of the HTML standard is the complete independence of the language from the platform on which a Web page described using HTML is displayed. A drawback of HTML is the limited number of tags, making it inefficient for describing complex data, as well as not enabling the use of user-defined tags, which would serve specific users’ needs.
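
The descriptive nature of HTML can be seen by reading a small page with a program. The following Python sketch uses the standard html.parser module; the page content and the class name OutlineParser are made up for this illustration. It lists the tags that describe the page and collects the targets of its links.

```python
from html.parser import HTMLParser

# A small hypothetical page: a title, a heading, a paragraph with a link, and an image.
PAGE = """<html><head><title>Web map</title></head>
<body><h1>Rivers</h1><p>See the <a href="map.html">map</a>.</p>
<img src="rivers.png" alt="River network"></body></html>"""


class OutlineParser(HTMLParser):
    """Record every opening tag and the target of every link."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            # attrs is a list of (name, value) pairs.
            self.links.extend(value for name, value in attrs if name == "href")


parser = OutlineParser()
parser.feed(PAGE)
print(parser.tags)
print(parser.links)
```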

13.1.3 XML

The eXtensible Markup Language (XML) is a meta markup language created in the mid-1990s as a result of the need for a language that would be similar to, but simpler than, the precursor of HTML, the Standard Generalized Markup Language (SGML). XML is a language which is simple to parse and process. When writing documents in XML, strict rules defined by the standard must be followed. Some of the most significant characteristics of XML are the following:

  • XML is an Internet language,

  • XML supports a wide variety of applications,

  • Writing programs that process XML documents (plain-text files) is simple,

  • XML does not contain optional and arbitrary parts,

  • XML documents are readable and clear to people,

  • XML documents are created very easily.
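
The strictness of XML syntax, and the ease of processing XML documents, can be illustrated with Python's standard xml.etree.ElementTree module. The element names and the helper is_well_formed() below are assumptions introduced for this sketch only.

```python
import xml.etree.ElementTree as ET

# A well-formed document with user-defined tags (hypothetical content).
well_formed = '<station id="S1"><name>Station A</name></station>'

# The same document with the closing tag of <name> missing.
broken = '<station id="S1"><name>Station A</station>'


def is_well_formed(text):
    """Return True if the text can be parsed as an XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(well_formed))
print(is_well_formed(broken))

# Reading data from the parsed document takes only a few lines.
station = ET.fromstring(well_formed)
print(station.get("id"), station.find("name").text)
```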

In time, a large number of applications of XML, i.e., XML dialects, appeared. Some of them are:

  • XHTML – used for hypertext documents,

  • MathML – used for mathematical content,

  • SMIL – used for multimedia content,

  • SVG – used for vector graphics,

  • SOAP – used for Web services,

  • GML – used for exchange and structuring of spatial data.

Owing to the strict syntax of XML documents, XML became in time a language used for writing various kinds of structured and semi-structured information. Specialized databases also appeared that store information in XML documents and are queried using specialized query languages (XPath, XQuery).
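
How a query language selects parts of an XML document can be sketched in Python. Note the assumptions: the standard xml.etree.ElementTree module supports only a limited subset of XPath, and the document below, with its element and attribute names, is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document describing three measuring stations.
DOC = """<stations>
  <station id="S1" type="rain"><name>Station A</name></station>
  <station id="S2" type="wind"><name>Station B</name></station>
  <station id="S3" type="rain"><name>Station C</name></station>
</stations>"""

root = ET.fromstring(DOC)

# Names of all stations whose "type" attribute equals "rain".
rain_names = [e.text for e in root.findall("./station[@type='rain']/name")]

# Name of the single station with a given identifier.
s2_name = root.find(".//station[@id='S2']/name").text

print(rain_names)
print(s2_name)
```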

Therefore, XML enables interoperability, communication, and the exchange of data between otherwise incompatible systems. XML is used for storing data in files and databases and for creating new languages, e.g., the Wireless Markup Language (WML). It facilitates sharing data between information systems, especially those connected to the Internet.

13.1.4 JavaScript/ECMAScript

JavaScript is the most popular language for writing client-side Web scripts. It was developed by the company Netscape and very quickly came into widespread use for adding dynamic features to Web pages. Soon afterwards, Microsoft added support for this language to Internet Explorer but, for legal reasons, called its dialect JScript. The original name of the language at Netscape was LiveScript, which was supposed to indicate that the purpose of the language was the production of dynamic, “live” Web pages. At almost the same time, the company Sun Microsystems developed the language Java, which gained great popularity on the Internet since it was easy to write Java applets in it, and Netscape added applet support to its Web browser. Somewhat later, Netscape renamed its language JavaScript in order to benefit from the popularity of Java.

Today, JavaScript is an implementation and, to a degree, an extension of the ECMAScript standard defined in documents ECMA-262 and ISO/IEC 16262. The language was standardized under the auspices of the European Computer Manufacturers Association (ECMA). ECMA is an industry association founded in 1961 and dedicated to standardization in the fields of Information and Communication Technology (ICT) and Consumer Electronics (CE).

JavaScript syntax is largely inspired by the Java syntax, which was in turn inspired by the syntax of C and C++. A large number of names are taken from Java. The language is case-sensitive. JavaScript is a dynamic, weakly typed language combining several programming paradigms, most notably the procedural and object-oriented ones. Programming in JavaScript relies heavily on standard library objects as well as Document Object Model (DOM) objects. JavaScript code is executed entirely within the Web browser process, and nothing outside of the browser can be accessed from it. This is important for security reasons, e.g., it stops malicious scripts from accessing the file system of the client machine.

Today, the Mozilla Foundation, considered to be the successor of Netscape, continues to develop its own implementation of JavaScript, while the language itself evolves through the ECMAScript standard.

More details about Internet technologies (in Serbian) can be found at http://poincare.matf.bg.ac.rs/~filip/uvit/.