This is a short tutorial on using the pyid3lib module. For instructions on installing it, see the README file included in the distribution.
Use the tag function to create a tag object for a given MP3 file.
>>> import pyid3lib
>>> x = pyid3lib.tag( 'track01.mp3' )
>>> x
<pyid3lib.tag object at 0x8155b70>
>>>

You can read and change the data contained in this object. To write any changes back into the original MP3 file, use the update() method.
>>> x.update()
>>>
There are two ways to access the data: the basic way, and the
advanced way.
>>> x = pyid3lib.tag( 'track01.mp3' )
>>> x.artist
'Aphex Twin'
>>> x.title
'Jynweythek'
>>> x.year
2001
>>> x.track
(1, 15)
>>>

You can also assign new values to these attributes:
>>> x.artist = 'Meat Beat Manifesto'
>>>
Don't forget that you have to call update() to actually write changes out to the file!
Most attributes require a string value, or ID3Error is raised:

>>> x.artist = 12
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
pyid3lib.ID3Error: 'artist' attribute must be string
>>>

Deleting an attribute (with "del x.attr") or assigning None to it deletes the corresponding piece of tag data.

Here are all the attributes that take strings:
album          | artist            | band
bpm            | composer          | conductor
contentgroup   | contenttype       | copyright
date           | encodedby         | encodersettings
fileowner      | filetype          | initialkey
involvedpeople | isrc              | language
leadartist     | lyricist          | mediatype
mixartist      | netradioowner     | netradiostation
origalbum      | origartist        | origfilename
origlyricist   | origyear          | playlistdelay
publisher      | recordingdates    | size
songlen        | subtitle          | time
title          | wwwartist         | wwwaudiofile
wwwaudiosource | wwwcommercialinfo | wwwcopyright
wwwpayment     | wwwpublisher      | wwwradiopage
Note that artist is a synonym for leadartist; the two attributes modify the same underlying piece of data.
There are a few other attributes that get special processing.

"year" is treated as an int, even though it is stored in the tag as a string. You can assign either a string or an int to it, but its value will always be returned as an int.

"tracknum" and "partinset" can be either a 1- or 2-tuple of ints. Setting tracknum to (4,17) indicates that this is track 4 of 17 on the original album. partinset functions similarly, when an album is divided into several chunks of media (e.g., a double-CD album). "track" is a synonym for "tracknum".
Here are some examples of using the track attribute:

>>> x.track = '4'       # all these are equivalent ways
>>> x.track = 4         # of saying "track 4"
>>> x.track = (4,)
>>>
>>> x.track = (4,17)    # these two are equivalent ways
>>> x.track = '4/17'    # of saying "track 4 of 17"
>>>
>>> x.track = 9         # no matter how it is set, the value
>>> x.track             # is returned as a 1- or 2-tuple of ints.
(9,)
>>> x.track = '10/12'
>>> x.track
(10, 12)
>>>
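The normalization shown in these examples can be sketched in plain Python. The helper below, normalize_track, is a hypothetical name used only for illustration; it is not part of pyid3lib and does not reflect how the module is implemented. It just mimics the documented behavior: accept an int, a string such as '4' or '4/17', or a 1- or 2-tuple, and always hand back a tuple of ints.

```python
def normalize_track(value):
    """Return a 1- or 2-tuple of ints for any accepted track spelling.

    Hypothetical helper for illustration only; pyid3lib does its own
    conversion internally.
    """
    if isinstance(value, int):
        return (value,)
    if isinstance(value, str):
        # '4' -> (4,)   and   '4/17' -> (4, 17)
        parts = tuple(int(p) for p in value.split('/'))
    else:
        # Assume a tuple or other sequence of int-like values.
        parts = tuple(int(p) for p in value)
    if len(parts) not in (1, 2):
        raise ValueError('track must have one or two parts')
    return parts
```

A helper like this can be handy when you want to compare track values from different sources before assigning them to x.track.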
To get at all the data, you have to use a different access method. First, I'll give a very brief introduction to how ID3v2 tags are structured (version 2.3 and higher, at least).
An ID3 tag consists of a header, plus one or more frames. Each frame has a four-character frame ID identifying what's stored in that frame, plus some data. There are a bunch of standardized frame IDs defined in the standard at id3.org. For instance, the "TALB" frame is used to store the name of the album that the track came from. The "TPE1" frame stores the name of the artist, and so on.
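To make the "header plus frames" layout a little more concrete, here is a minimal sketch that decodes the 10-byte tag header. It follows my reading of the ID3v2 specification, not anything pyid3lib exposes; parse_id3v2_header is a hypothetical name, and pyid3lib never requires you to read these bytes yourself.

```python
import struct


def parse_id3v2_header(data):
    """Decode the 10-byte ID3v2 tag header, or return None if absent.

    Hypothetical helper for illustration; not part of pyid3lib.
    """
    if len(data) < 10 or data[:3] != b'ID3':
        return None
    major, revision, flags = struct.unpack('>BBB', data[3:6])
    # The tag size is "synchsafe": four bytes, 7 significant bits each.
    size = 0
    for byte in bytearray(data[6:10]):
        size = (size << 7) | (byte & 0x7F)
    return {'version': (2, major, revision), 'flags': flags, 'size': size}
```

The size field counts the bytes that follow the header, which is where the frames live.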
pyid3lib "tag" objects support Python's sequencing and iteration protocols. Accessing an item of this sequence gives you a dictionary with the contents of the corresponding frame. For instance:
>>> x = pyid3lib.tag( 'track01.mp3' )
>>> for i in x: print i      # iterate over all frames, printing them out
...
{'text': 'Aphex Twin', 'textenc': 0, 'frameid': 'TPE1'}
{'text': 'Drukqs [1/2]', 'textenc': 0, 'frameid': 'TALB'}
{'text': '1/2', 'textenc': 0, 'frameid': 'TPOS'}
{'text': '2001', 'textenc': 0, 'frameid': 'TYER'}
{'text': 'Jynweythek', 'textenc': 0, 'frameid': 'TIT2'}
{'text': '1/15', 'textenc': 0, 'frameid': 'TRCK'}
{'text': '(26)', 'textenc': 0, 'frameid': 'TCON'}
{'text': '143386', 'textenc': 0, 'frameid': 'TLEN'}
>>> x[4]                     # access a single frame
{'text': 'Jynweythek', 'textenc': 0, 'frameid': 'TIT2'}
>>> x[:2]                    # access a slice of frames
[{'text': 'Aphex Twin', 'textenc': 0, 'frameid': 'TPE1'}, {'text': 'Drukqs [1/2]', 'textenc': 0, 'frameid': 'TALB'}]
>>>
You can modify the tag in all the usual ways you can manipulate a list: assign to an element or slice, or via the append, extend, insert, pop, and remove methods. In each case the thing you put into the tag must be a dictionary, and the dictionary must contain a 'frameid' key whose value is a legal frame ID. (Of course, extend and slice assignment both require a sequence of legal dictionaries.)
>>> d = { 'frameid' : 'TPE1', 'text' : 'New Artist Name' }
>>> x[0] = d
>>> x.pop()
{'text': '143386', 'textenc': 0, 'frameid': 'TLEN'}
>>> [i['frameid'] for i in x]
['TPE1', 'TALB', 'TPOS', 'TYER', 'TIT2', 'TRCK', 'TCON']
>>>

The methods index and remove, which search the sequence for a value, take a frame ID string as argument.
>>> i = x.index( 'TIT2' )
>>> print i
4
>>> x[i]
{'text': 'Jynweythek', 'textenc': 0, 'frameid': 'TIT2'}
>>>

It's important to remember that the dictionaries you get out of a tag object are merely copies of the frame data; modifying the dictionary does not modify the tag! To change the tag, you have to explicitly assign back into it. For instance:
>>> x.title                     # here is the track's title
'Jynweythek'
>>> d = x[x.index('TIT2')]      # access the corresponding frame
>>> d
{'text': 'Jynweythek', 'textenc': 0, 'frameid': 'TIT2'}
>>> d['text'] = 'New Title'     # modify the returned dictionary
>>> x.title                     # see? the tag data hasn't changed.
'Jynweythek'
>>> x[x.index('TIT2')] = d      # set the frame based on the modified dictionary
>>> x.title                     # now the tag data reflects the change.
'New Title'
>>>

Modifying the tag through attributes works on exactly the same data as modifying it through the sequence operations. The attributes are provided simply for convenience; it's easier to remember names like "artist" than sometimes-cryptic frame IDs like "TPE1".
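The read, modify, assign-back pattern from the session above can be wrapped in a small helper. This is only a sketch: update_frame is a hypothetical name, and FakeTag is a plain-Python stand-in that I wrote so the example runs without pyid3lib installed. It assumes only what this tutorial describes, namely index() taking a frame ID and item access returning copies; raising ValueError for a missing frame is my assumption for the stand-in.

```python
class FakeTag(object):
    """Stand-in for a tag object: item access hands out copies."""

    def __init__(self, frames):
        self._frames = [dict(f) for f in frames]

    def index(self, frameid):
        for pos, frame in enumerate(self._frames):
            if frame['frameid'] == frameid:
                return pos
        raise ValueError('frame %r not found' % (frameid,))

    def __getitem__(self, pos):
        return dict(self._frames[pos])    # a copy, never the original

    def __setitem__(self, pos, frame):
        self._frames[pos] = dict(frame)


def update_frame(tag, frameid, **fields):
    """Copy out the first matching frame, change it, assign it back."""
    pos = tag.index(frameid)
    frame = tag[pos]          # this is a copy
    frame.update(fields)      # the tag is still unchanged here
    tag[pos] = frame          # this is the step that changes the tag
```

With a real tag object, the equivalent call would be update_frame(x, 'TIT2', text='New Title'), followed by x.update() to write the file.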
Setting the value of an attribute will first go through and delete all frames of the corresponding frame ID, then append a new frame with the new value. So saying:
    x.artist = 'Aphex Twin'

is roughly equivalent to:

    try:
        while 1:
            x.remove( 'TPE1' )
    except ValueError:
        pass
    x.append( { 'frameid' : 'TPE1', 'text' : 'Aphex Twin' } )
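To see what that delete-then-append behavior does to frame order, here is the same idea applied to an ordinary Python list of frame dictionaries. set_text_frame is a hypothetical helper that works on a plain list, not on a pyid3lib tag, so treat it as a model of the described behavior and nothing more.

```python
def set_text_frame(frames, frameid, text):
    """Drop every frame with this ID, then append one new frame.

    `frames` is a plain list of dicts standing in for a tag.
    """
    frames[:] = [f for f in frames if f['frameid'] != frameid]
    frames.append({'frameid': frameid, 'text': text})
```

Two consequences follow from this model: duplicate frames with the same ID collapse into one, and the new frame ends up at the end of the sequence, so don't rely on a frame keeping its position after you assign to its attribute.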
>>> pyid3lib.query( 'TALB' )
(24, 'Album/Movie/Show title', ('textenc', 'text'))
>>> pyid3lib.query( 'WOAR' )
(69, 'Official artist/performer webpage', ('url',))
>>> pyid3lib.query( 'APIC' )
(2, 'Attached picture', ('textenc', 'imageformat', 'mimetype', 'picturetype', 'description', 'data'))
>>> pyid3lib.query( 'QQQQ' )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
pyid3lib.ID3Error: frame ID 'QQQQ' is not supported by id3lib
>>>

The return value from query is a tuple of three values. The first can be ignored; it's used internally. The second is a string with a brief description of that frame's purpose. The third is a tuple of strings with the names of the individual fields of that frame. Hopefully many of these will be self-explanatory; for more information, you could look at the standard.
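One use for the tuple of field names is to catch misspelled keys before putting a dictionary into a tag. The sketch below uses a hypothetical helper, unexpected_fields, and hard-codes the field tuple from the query( 'APIC' ) output shown earlier so that it runs without pyid3lib. How pyid3lib itself treats unknown keys is not covered in this tutorial, so this is a pre-check, not a description of the module's behavior.

```python
# Field names copied from the pyid3lib.query( 'APIC' ) output above.
APIC_FIELDS = ('textenc', 'imageformat', 'mimetype',
               'picturetype', 'description', 'data')


def unexpected_fields(frame, field_names):
    """Return the keys of `frame` that are not known field names.

    'frameid' is always allowed, since every frame dict needs it.
    """
    allowed = set(field_names) | {'frameid'}
    return sorted(key for key in frame if key not in allowed)
```

With the module installed, you would pass pyid3lib.query( frameid )[2] as field_names instead of a hard-coded tuple.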
>>> f = open( 'pic.jpg', 'rb' )
>>> d = { 'frameid' : 'APIC', 'mimetype' : 'image/jpeg',
...       'description' : 'A pretty picture.',
...       'picturetype' : 3,
...       'data' : f.read() }
>>> f.close()
>>> x.append( d )
>>>

See? Nothing to it. The value 3 that was assigned to 'picturetype' identifies the picture as the front of the album cover. For a list of all the picturetypes, see section 4.14 of the standard.
To extract an embedded picture, you do pretty much the opposite thing:
>>> d = x[x.index('APIC')]      # this finds the first embedded picture in the tag
>>> d['mimetype']
'image/jpeg'
>>> f = open( 'output.jpg', 'wb' )
>>> f.write( d['data'] )
>>> f.close()
>>>
For maximum compatibility, you should limit your pictures to JPEGs and PNGs (mimetypes "image/jpeg" and "image/png", respectively). Most software that reads picture tags will be able to support at least these two image formats (and your software should, too!)
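If you want to enforce that JPEG-or-PNG advice in your own code, one option is to look at the first few bytes of the image data instead of trusting the file name. The sketch below is an assumption-laden example: sniff_mimetype and make_apic_frame are hypothetical names, not pyid3lib functions, and the dictionary keys are the ones used in the embedding example earlier in this tutorial.

```python
JPEG_MAGIC = b'\xff\xd8\xff'
PNG_MAGIC = b'\x89PNG\r\n\x1a\n'


def sniff_mimetype(data):
    """Return 'image/jpeg' or 'image/png' based on the leading bytes."""
    if data.startswith(JPEG_MAGIC):
        return 'image/jpeg'
    if data.startswith(PNG_MAGIC):
        return 'image/png'
    return None


def make_apic_frame(data, description='', picturetype=3):
    """Build an APIC frame dict, refusing anything but JPEG or PNG."""
    mimetype = sniff_mimetype(data)
    if mimetype is None:
        raise ValueError('only JPEG and PNG data are accepted here')
    return {'frameid': 'APIC', 'mimetype': mimetype,
            'description': description, 'picturetype': picturetype,
            'data': data}
```

The resulting dictionary can then be appended to a tag the same way as the hand-built one, for example x.append( make_apic_frame( f.read(), 'A pretty picture.' ) ).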
>>> x = pyid3lib.tag( 'song.mp3' )
>>> x.update()

Note that opening a tag and calling update(), as in the two lines above, causes unknown frames to be stripped from the tag. This is a limitation of the underlying id3lib library.