*******************************************************************************
*                                                                             *
*                    IDNA Convert (idna_convert.class.php)                    *
*                                                                             *
*         http://idnaconv.phlymail.de     mailto:phlymail@phlylabs.de         *
*******************************************************************************
* (c) 2004-2007 phlyLabs, Berlin                                              *
* This file is encoded in UTF-8                                               *
*******************************************************************************

Introduction
------------

The class idna_convert converts internationalized domain names (see RFC 3490,
3491, 3492 and 3454 for details), as used by various registries worldwide,
between their original (localized) form and their encoded form as it is used
in the DNS (Domain Name System).

The class provides two conversion methods, encode() and decode(), which do
exactly what you would expect them to do. You can pass complete domain names,
simple strings and complete email addresses. That means you can use any of the
following notations:

  - www.nörgler.com
  - xn--nrgler-wxa
  - xn--brse-5qa.xn--knrz-1ra.info

Errors, incorrectly encoded or invalid strings lead either to a FALSE response
(in strict mode) or to only partially converted strings. You can query the
error that occurred by calling the method get_last_error().
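
A minimal sketch of how calling code might react to a FALSE response. The
helper encode_checked() and the stand-in class demo_converter are hypothetical
and exist only so the snippet runs on its own; with the real class you would
pass an idna_convert instance instead. The sketch assumes nothing beyond what
is stated above: encode() may return FALSE and get_last_error() reports why.

```php
<?php
// Hypothetical stand-in for idna_convert so the sketch is self-contained.
// It only mimics the FALSE-on-error behaviour; it converts nothing.
class demo_converter
{
    private $error = false;

    public function encode($input)
    {
        if ($input === '') {
            $this->error = 'Empty input';
            return false;
        }
        return $input;
    }

    public function get_last_error()
    {
        return $this->error;
    }
}

// Hypothetical helper: returns array(result, error message or null).
function encode_checked($idn, $input)
{
    $result = $idn->encode($input);
    // Compare strictly: a valid result such as "0" is also "falsy" in PHP.
    if ($result === false) {
        return array(false, $idn->get_last_error());
    }
    return array($result, null);
}

list($ok, $err)  = encode_checked(new demo_converter(), 'example.com');
list($bad, $why) = encode_checked(new demo_converter(), '');
```

Outside strict mode a failed conversion may come back as a partially converted
string rather than FALSE, so checking get_last_error() as well is the safer
habit.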

Unicode strings are expected to be either UTF-8 strings, UCS-4 strings or UCS-4
arrays. The default format is UTF-8. To set a different encoding, call the
method setParams() - please see the inline documentation for details.
ACE strings (the Punycode form) are always 7-bit ASCII strings.
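
To make the three Unicode formats more concrete, here is a small sketch that
turns a UTF-8 string into a UCS-4 string and a UCS-4 array, and checks whether
a string is 7-bit ASCII. All function names are hypothetical and not part of
the class. It assumes that PHP's iconv extension is available and that a
"UCS-4 string" stores each code point as four big-endian bytes; please verify
the byte order against the inline documentation.

```php
<?php
// Hypothetical helpers for illustration; not part of idna_convert.

// UTF-8 string -> UCS-4 string (four bytes per code point).
// UTF-32BE is used as the iconv name for big-endian UCS-4 (assumption).
function utf8_to_ucs4_string($utf8)
{
    return iconv('UTF-8', 'UTF-32BE', $utf8);
}

// UCS-4 string -> UCS-4 array (one integer code point per element).
function ucs4_string_to_array($ucs4)
{
    // 'N*' reads consecutive unsigned 32-bit big-endian integers.
    return array_values(unpack('N*', $ucs4));
}

// ACE strings must consist of 7-bit ASCII bytes only.
function is_7bit_ascii($str)
{
    return preg_match('/\A[\x00-\x7F]*\z/', $str) === 1;
}

$utf8       = "n\xC3\xB6rgler";              // "nörgler" as UTF-8 bytes
$ucs4       = utf8_to_ucs4_string($utf8);    // 7 code points, 28 bytes
$codepoints = ucs4_string_to_array($ucs4);   // 0x6E, 0xF6, 0x72, ...
```

The same text therefore occupies 8 bytes as UTF-8, 28 bytes as a UCS-4 string,
and 7 array elements as a UCS-4 array.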

ATTENTION: We no longer supply the PHP5 version of the class. It is not
necessary for a successful conversion, since the supplied PHP code is
compatible with both PHP4 and PHP5. We do not expect any compatibility issues
with the upcoming PHP6 either.


Files
-----

idna_convert.class.php         - The actual class
idna_convert.create.npdata.php - Useful for (re)creating the NPData file
npdata.ser                     - Serialized data for NamePrep
example.php                    - An example web page for converting
ReadMe.txt                     - This file
LICENCE                        - The LGPL licence file

The class is contained in idna_convert.class.php.
MAKE SURE to copy the npdata.ser file into the same folder as the class file
itself!
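
If you want to verify the installation from a script, a check along the
following lines may help. The function npdata_is_usable() is a hypothetical
helper, not part of the package, and it assumes that npdata.ser unserializes
to an array of tables.

```php
<?php
// Hypothetical helper: reports whether $dir holds an npdata.ser file that
// can be read and unserialized. Assumption: the content is an array.
function npdata_is_usable($dir)
{
    $file = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . 'npdata.ser';
    if (!is_readable($file)) {
        return false;
    }
    // @ suppresses the notice that unserialize() raises on malformed data.
    $data = @unserialize(file_get_contents($file));
    return is_array($data);
}

// Typical call from a script placed next to the class file:
// var_dump(npdata_is_usable(dirname(__FILE__)));
```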


Examples
--------

1. Say we wish to encode the domain name nörgler.com:

  // Include the class
  include_once('idna_convert.class.php');
  // Instantiate it
  $IDN = new idna_convert();
  // The input string. Input that is not UTF-8 or UCS-4 must be converted first;
  // utf8_encode() does this for ISO-8859-1 input and is not needed for UTF-8.
  $input = utf8_encode('nörgler.com');
  // Encode it to its Punycode representation
  $output = $IDN->encode($input);
  // Output what we got
  echo $output; // This will read: xn--nrgler-wxa.com


2. We received an email from a Punycode-encoded domain and want to learn how
   the domain name reads originally:

  // Include the class
  include_once('idna_convert.class.php');
  // Instantiate it
  $IDN = new idna_convert();
  // The input string
  $input = 'andre@xn--brse-5qa.xn--knrz-1ra.info';
  // Decode it to its original (Unicode) representation
  $output = $IDN->decode($input);
  // Output what we got. If the output should be in a format other than UTF-8
  // or UCS-4, you will have to convert it before outputting it
  echo utf8_decode($output); // This will read: andre@börse.knörz.info


3. The input is read from a UCS-4 coded file and encoded line by line. By
   appending the optional second parameter we tell encode() about the input
   format to be used:

  // Include the class
  include_once('idna_convert.class.php');
  // Instantiate it
  $IDN = new idna_convert();
  // Iterate through the input file line by line
  foreach (file('ucs4-domains.txt') as $line) {
      echo $IDN->encode(trim($line), 'ucs4_string');
      echo "\n";
  }


NPData
------

Should you need to recreate the npdata.ser file, which holds all necessary
translation tables in a serialized format, you can run the file
idna_convert.create.npdata.php, which creates the file and stores it in the
folder where the script itself is placed.
You can change the tables if you need to, but beware of the consequences.


Contact us
----------

In case of errors, bugs, questions or wishes, please don't hesitate to contact
us at the email address above.

The team of phlyLabs
http://phlylabs.de
mailto:phlymail@phlylabs.de