-[**CodExt**](https://github.com/dhondta/python-codext) is a (Python2-3 compatible) library that extends the native [`codecs`](https://docs.python.org/3/library/codecs.html) library (namely for adding new custom encodings and character mappings) and provides **120+ new codecs**, hence its name combining *CODecs EXTension*. It also features a **guess mode** for decoding multiple layers of encoding and **CLI tools** for convenience.
+[**CodExt**](https://github.com/dhondta/python-codext) is a (Python2-3 compatible) library that extends the native [`codecs`](https://docs.python.org/3/library/codecs) library (namely for adding new custom encodings and character mappings) and provides **120+ new codecs**, hence its name combining *CODecs EXTension*. It also features a **guess mode** for decoding multiple layers of encoding and **CLI tools** for convenience.

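As a rough illustration of what "adding new custom encodings" to the native `codecs` library involves, here is a minimal sketch that uses only the standard library. The codec name `reverse_demo` and the helper functions are hypothetical and chosen for this example; this is not CodExt's own registration helper, only the underlying `codecs.register` mechanism it builds on.

```python
import codecs


def _reverse_encode(text, errors="strict"):
    # A codec function returns (output, number of input items consumed).
    return text[::-1], len(text)


def _reverse_decode(text, errors="strict"):
    return text[::-1], len(text)


def _search(name):
    # codecs calls every registered search function with the normalized name.
    if name == "reverse_demo":  # hypothetical codec name for this sketch
        return codecs.CodecInfo(_reverse_encode, _reverse_decode, name="reverse_demo")
    return None


codecs.register(_search)

# codecs.encode/decode accept text-to-text codecs, unlike str.encode.
encoded = codecs.encode("this is a test", "reverse_demo")
decoded = codecs.decode(encoded, "reverse_demo")
print(encoded)
print(decoded)
```

The module-level `codecs.encode` and `codecs.decode` functions are used here because `str.encode` only accepts codecs that produce bytes.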
```sh
$ pip install codext
```

Want to contribute a new codec ? | Want to contribute a new macro ?
-Check the [documentation](https://python-codext.readthedocs.io/en/latest/howto.html) first<br>Then [PR](https://github.com/dhondta/python-codext/pulls) your new codec | [PR](https://github.com/dhondta/python-codext/pulls) your updated version of [`macros.json`](https://github.com/dhondta/python-codext/blob/main/codext/macros.json)
+Check the [documentation](https://python-codext.readthedocs.io/en/latest/howto) first<br>Then [PR](https://github.com/dhondta/python-codext/pulls) your new codec | [PR](https://github.com/dhondta/python-codext/pulls) your updated version of [`macros.json`](https://github.com/dhondta/python-codext/blob/main/codext/macros.json)
- [X] `base1`: useless, but for the sake of completeness
- [X] `base2`: simple conversion to binary (with a variant with a reversed alphabet)
@@ -221,7 +221,7 @@ o
- [X] `base11`: conversion to digits with a "*a*"
- [X] `base16`: simple conversion to hexadecimal (with a variant holding an alphabet with digits and letters inverted)
- [X] `base26`: conversion to alphabet letters
-- [X] `base32`: classical conversion according to the RFC4648 with all its variants ([zbase32](https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt), extended hexadecimal, [geohash](https://en.wikipedia.org/wiki/Geohash), [Crockford](https://www.crockford.com/base32.html))
+- [X] `base32`: classical conversion according to the RFC4648 with all its variants ([zbase32](https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt), extended hexadecimal, [geohash](https://en.wikipedia.org/wiki/Geohash), [Crockford](https://www.crockford.com/base32))
- [X] `base36`: [Base36](https://en.wikipedia.org/wiki/Base36) conversion to letters and digits (with a variant inverting both groups)
- [X] `base45`: [Base45](https://datatracker.ietf.org/doc/html/draft-faltstrom-base45-04.txt) DRAFT algorithm (with a variant inverting letters and digits)
- [X] `base58`: multiple versions of [Base58](https://en.bitcoinwiki.org/wiki/Base58) (bitcoin, flickr, ripple)
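
To make the idea of a Base32 "variant" concrete, the sketch below derives the RFC 4648 extended hexadecimal form from the standard one by swapping alphabets. It relies only on the standard `base64` module; the helper names `b32hex_encode` and `b32hex_decode` are hypothetical and do not reflect how CodExt implements its variants.

```python
import base64

# Both alphabets are taken from RFC 4648 (sections 6 and 7).
STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
EXTENDED_HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"

TO_HEX = str.maketrans(STANDARD, EXTENDED_HEX)
FROM_HEX = str.maketrans(EXTENDED_HEX, STANDARD)


def b32hex_encode(data: bytes) -> str:
    # Encode with the standard alphabet, then map each symbol to its
    # counterpart at the same index; "=" padding is left untouched.
    return base64.b32encode(data).decode("ascii").translate(TO_HEX)


def b32hex_decode(text: str) -> bytes:
    # Map back to the standard alphabet before decoding.
    return base64.b32decode(text.translate(FROM_HEX))


print(b32hex_encode(b"foo"))
print(b32hex_decode(b32hex_encode(b"foo")))
```

The same alphabet-swap approach works for any variant that keeps the 5-bit grouping of RFC 4648 and only changes the symbol set.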
- [X] `gzip`: standard Gzip compression/decompression
- [X] `lz77`: compresses the given data with the algorithm of Lempel and Ziv of 1977
@@ -272,7 +277,7 @@ This category also contains `ascii85`, `adobe`, `[x]btoa`, `zeromq` with the `ba
> :warning: Compression functions are of course definitely **NOT** encoding functions ; they are implemented for leveraging the `.encode(...)` API from `codecs`.
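
The standard library already exposes a few compressors through the same `.encode(...)`-style API, which gives a feel for the pattern described in the warning above. This sketch uses the built-in `zlib_codec` rather than any CodExt codec, so it makes no assumption about CodExt's codec names.

```python
import codecs
import zlib

data = b"this is a test " * 20

# "zlib_codec" is a bytes-to-bytes codec shipped with CPython.
packed = codecs.encode(data, "zlib_codec")
restored = codecs.decode(packed, "zlib_codec")

# The codec output is an ordinary zlib stream.
assert restored == data
assert zlib.decompress(packed) == data
print(len(data), "->", len(packed))
```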
@@ -287,18 +292,17 @@ This category also contains `ascii85`, `adobe`, `[x]btoa`, `zeromq` with the `ba
> :warning: Crypto functions are of course definitely **NOT** encoding functions ; they are implemented for leveraging the `.encode(...)` API from `codecs`.
- [X] `blake`: includes BLAKE2b and BLAKE2s (Python 3 only ; relies on `hashlib`)
-- [X] `checksums`: includes Adler32 and CRC32 (relies on `zlib`)
- [X] `crypt`: Unix's crypt hash for passwords (Python 3 and Unix only ; relies on `crypt`)
- [X] `md`: aka Message Digest ; includes MD4 and MD5 (relies on `hashlib`)
- [X] `sha`: aka Secure Hash Algorithms ; includes SHA1, 224, 256, 384, 512 (Python2/3) but also SHA3-224, -256, -384 and -512 (Python 3 only ; relies on `hashlib`)
- [X] `shake`: aka SHAKE hashing (Python 3 only ; relies on `hashlib`)

> :warning: Hash functions are of course definitely **NOT** encoding functions ; they are implemented for convenience with the `.encode(...)` API from `codecs` and useful for chaining codecs.
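
The list above names `hashlib` and `zlib` as the underlying modules. The sketch below calls them directly to show what such one-way "codecs" compute and what chaining an encoding with a hash looks like; the variable names are illustrative, and no CodExt codec name is assumed.

```python
import base64
import hashlib
import zlib

data = b"this is a test"

# Digests from hashlib (MD5 may be disabled on FIPS-restricted builds).
md5_hex = hashlib.md5(data).hexdigest()
sha256_hex = hashlib.sha256(data).hexdigest()

# Checksums from zlib are plain unsigned integers.
crc = zlib.crc32(data)
adler = zlib.adler32(data)

# Chaining: Base64-encode first, then hash the encoded bytes.
chained = hashlib.sha1(base64.b64encode(data)).hexdigest()

print(md5_hex, sha256_hex, crc, adler, chained)
```

Unlike the encodings listed earlier, none of these steps can be reversed, which is why there is no matching decode operation.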
- [X] `hexagram`: uses Base64 and encodes the result to a charset of [I Ching hexagrams](https://en.wikipedia.org/wiki/Hexagram_%28I_Ching%29) (as implemented [here](https://github.com/qntm/hexagram-encode))
- [X] `klopf`: aka Klopf code ; Polybius square with trivial alphabetical distribution
@@ -328,7 +332,7 @@ This category also contains `ascii85`, `adobe`, `[x]btoa`, `zeromq` with the `ba
- [X] `whitespace`: replaces bits with whitespaces and tabs
- [X] `whitespace_after_before`: variant of `whitespace` ; encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "`whitespace+2*after-3*before`")
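
The `whitespace` entry above can be pictured with a small sketch that writes each bit as a space or a tab. The mapping used here (0 as space, 1 as tab) and the function names `ws_encode` and `ws_decode` are assumptions made for illustration; CodExt's actual mapping may differ.

```python
def ws_encode(data: bytes, zero: str = " ", one: str = "\t") -> str:
    # Expand every byte to 8 bits, then swap "0"/"1" for the two whitespace symbols.
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits.translate(str.maketrans("01", zero + one))


def ws_decode(text: str, zero: str = " ", one: str = "\t") -> bytes:
    # Reverse the substitution and regroup the bits into bytes.
    bits = text.translate(str.maketrans(zero + one, "01"))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


hidden = ws_encode(b"A")
print(repr(hidden))
print(ws_decode(hidden))
```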
-All kinds of other codecs are categorized in "*Others*".
-
------
-
-### DNA
-
-This implements the 8 methods of ATGC nucleotides following the rule of complementary pairing, according to the literature about coding and computing of DNA sequences.
-`consonant-indices` | text <-> text with consonant indices | `consonants_indices`, `consonants_index` | while decoding, searches from the longest match, possibly not producing the original input
-`vowel-indices` | text <-> text with vowel indices | `vowels_indices`, `vowels_index` |
-`consonant-vowel-indices` | text <-> text with consonant and vowel indices | `consonants-vowels_index` | prefixes consonants with `C` and vowels with `V`
-
-```python
->>> codext.encode("This is a test", "consonant-index")
+All kinds of other codecs are categorized in "*Others*".
+
+-----
+
+### DNA
+
+This implements the 8 methods of ATGC nucleotides following the rule of complementary pairing, according to the literature about coding and computing of DNA sequences.
+`consonant-indices` | text <-> text with consonant indices | `consonants_indices`, `consonants_index` | while decoding, searches from the longest match, possibly not producing the original input
+`vowel-indices` | text <-> text with vowel indices | `vowels_indices`, `vowels_index` |
+`consonant-vowel-indices` | text <-> text with consonant and vowel indices | `consonants-vowels_index` | prefixes consonants with `C` and vowels with `V`
+
+```python
+>>> codext.encode("This is a test", "consonant-index")
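
For the DNA section in the diff above, the figure of 8 methods can be reproduced with a short enumeration. The sketch assumes the usual convention from the DNA-coding literature: each nucleotide stands for a 2-bit pair, and complementary nucleotides (A/T, C/G) must map to bitwise-complementary pairs. The function names are hypothetical, and the order in which CodExt numbers its rules is not assumed.

```python
from itertools import permutations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
PAIRS = ("00", "01", "10", "11")


def flip(bits: str) -> str:
    # Bitwise complement of a 2-bit string, e.g. "01" -> "10".
    return "".join("1" if b == "0" else "0" for b in bits)


def dna_rules() -> list:
    # Keep only the assignments where complementary bit pairs map to
    # complementary nucleotides.
    rules = []
    for bases in permutations("ACGT"):
        rule = dict(zip(PAIRS, bases))
        if all(rule[flip(p)] == COMPLEMENT[rule[p]] for p in PAIRS):
            rules.append(rule)
    return rules


def dna_encode(data: bytes, rule: dict) -> str:
    # Split the bit string into 2-bit pairs and look each one up.
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(rule[bits[i:i + 2]] for i in range(0, len(bits), 2))


rules = dna_rules()
print(len(rules))
print(dna_encode(b"A", {"00": "A", "01": "C", "10": "G", "11": "T"}))
```

Of the 24 possible assignments of four nucleotides to four bit pairs, only those respecting the pairing constraint remain, which is where the count of 8 comes from.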