String (computing): Difference between revisions

From Citizendium
Older revision by Ed Poor: m (grammar)
Newer revision by Tom Morris: no edit summary
{{subpages}}


In [[computing]] and more specifically in various [[programming languages]], '''strings''' are a variable type that can hold text<ref>{{cite web|
url=http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html |
title=Java API Documentation: Strings |
author=Sun Microsystems |
accessdate=2009-07-04
}}</ref>, as opposed to integer ("int") variables, which can hold only whole numbers (such as 5), or "float" variables, which hold floating-point numbers (numbers with a fractional part, such as 5.5 or 6.3).


==Various implementations of the String variable type==
Some languages, such as [[Java programming language|Java]], require the developer to declare a variable as a String type. Other languages, such as [[Python programming language|Python]], are dynamically typed: a variable takes the type of whatever value is assigned to it. This can be helpful, but it can also get in the way. If the value 1 arrives as the string '1' (because it was read from a file or typed in by a user, for example), the developer has to convert that string explicitly to an 'int' before any arithmetic can be performed.


Some developers prefer to declare the types of their own variables (as in Java, [[C programming language|C]] or [[C++]]), while others prefer Python's dynamic typing because it can simplify a program. Anyone who has had to debug a Python script only to find that a variable held a string instead of an int understands why some developers become frustrated with dynamic typing.


A Python string:
<pre>MyPythonString = "This is a string"</pre>


A Java string:
<pre>String MyJavaString = "abc";</pre>


Note the difference: in Python a variable is simply assigned, and Python determines from the value that it holds a string. In Java (the second example), the type name String declares that the variable MyJavaString holds a string.


==References==
{{reflist|2}}

Revision as of 15:30, 15 April 2010

This article is a stub and thus not approved.

In computer programming languages, a string is a data type consisting of a sequence of characters. In some languages, a string is simply a list of characters with some convenient helper methods that make it behave more like a block of text. Because a string is a list, many programming languages let you use array or list processing methods on it: getting the n-th member of the list returns the n-th character of the string.
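
As a rough illustration of list-style access, here is a minimal sketch in Python 3 (the language choice and the variable names are assumptions made for this example, not part of the article):

```python
word = "hello"

# Indexing works as it does on a list; Python counts positions from 0,
# so index 2 is the third character.
third = word[2]

# Slicing and conversion to a list are ordinary sequence operations too.
first_three = word[:3]
letters = list(word)

print(third)        # l
print(first_three)  # hel
print(letters)      # ['h', 'e', 'l', 'l', 'o']
```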

With most traditional text encodings, each member of the list is a single character stored in a single byte, so the word 'hello' contains five characters and occupies five bytes. With the introduction of Unicode, many programming languages now support multibyte string encodings, in which some characters occupy a single byte and others occupy several. In the Java programming language (and many languages which run on the Java platform: JRuby, Scala, Groovy etc.), strings are sequences of UTF-16 code units and can contain any Unicode character, although methods such as length() count code units rather than characters. Python 2 has a separate Unicode datatype, while in Python 3 the built-in string type is Unicode text. Ruby supports multibyte string encodings natively from version 1.9, and in earlier versions through extra libraries.
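
The gap between character count and byte count can be seen in a short sketch, assuming Python 3 and the UTF-8 encoding (both are choices made for this example):

```python
ascii_word = "hello"
accented = "h\u00e9llo"  # "héllo": é (U+00E9) lies outside ASCII

# Encoding turns the text into the bytes that would actually be stored.
ascii_bytes = ascii_word.encode("utf-8")
accented_bytes = accented.encode("utf-8")

# Both strings have five characters, but é needs two bytes in UTF-8.
print(len(ascii_word), len(ascii_bytes))     # 5 5
print(len(accented), len(accented_bytes))    # 5 6
```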

Strings can be implicitly or explicitly converted into other datatypes depending on the programming language. Consider the following statement:

print "My favourite number is " + 5

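In many languages, the literal 5 represents an integer. In languages that convert implicitly (Java and JavaScript, for example), it is automatically converted into the string '5' and appended to the preceding string; other languages, such as Python, reject the expression with a type error and require an explicit conversion. Now consider the following: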

if (10 == "10") { /* ... */ }

In some languages, this conditional will not be satisfied: it compares an integer with a string, and the types do not match. For many uses, however, such strict matching is unnecessarily pedantic. If the string were converted into an integer, it would equal the integer it is being compared with. Similarly, if the integer were converted into a string, it would equal the string it is being compared with.
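
Python 3 is one language that performs neither conversion implicitly, which makes it a convenient place to try both statements. The following is a sketch only; the variable names are invented for the example:

```python
prefix = "My favourite number is "

# Python 3 does not convert the integer implicitly; "+" raises TypeError.
try:
    message = prefix + 5
except TypeError:
    # An explicit conversion states the intent.
    message = prefix + str(5)

# Comparing an int with a str is allowed, but the result is never true.
strict_match = (10 == "10")

# Converting either side explicitly makes the two values comparable.
match_as_int = (10 == int("10"))
match_as_str = (str(10) == "10")

print(message)                                   # My favourite number is 5
print(strict_match, match_as_int, match_as_str)  # False True True
```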

A conversion performed automatically in this way is called an implicit conversion, and some languages (Scala, for instance) let the programmer define how such conversions happen by declaring implicit type conversion functions.
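
Python has no direct counterpart to Scala's implicit conversion functions. A loose analogue, offered only as a sketch, is a user-defined type whose comparison method performs the conversion itself. The class name NumericText is hypothetical and exists only for this example:

```python
class NumericText(str):
    """Hypothetical string type that compares equal to a matching integer."""

    def __eq__(self, other):
        if isinstance(other, int):
            try:
                return int(self) == other
            except ValueError:
                return False  # text that is not a number never matches
        return super().__eq__(other)

    # Defining __eq__ disables inherited hashing, so restore it explicitly.
    __hash__ = str.__hash__


ten = NumericText("10")

print(10 == ten)                 # True
print(ten == "10")               # True
print(NumericText("abc") == 10)  # False
```

The conversion rule lives in one place, as it would in a Scala implicit function, but it applies only to this one type. Note also that a NumericText and the integer it equals still hash differently, so the sketch is unsuitable for dictionary keys.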
