View Issue Details

ID: 0002033
Project: libextractor
Category: libextractor main library
View Status: public
Last Update: 2012-09-25 17:18
Reporter: dad
Assigned To: LRN
Priority: normal
Severity: feature
Reproducibility: N/A
Status: closed
Resolution: fixed
Product Version: 0.6.3
Target Version: 1.0.0
Fixed in Version: 1.0.0
Summary: 0002033: A matroska audio for test purpose
Description: I include a simple Matroska audio file with tags for tests.

The tags are not reported, and some of the information reported for the attached sample file is erroneous.

It's Richard Stallman singing the Free Software Song (https://www.gnu.org/music/free-software-song.html)

Additional Information: The extract command output:

Keywords for file Free Software Song - Richard Stallman.mka:
mimetype - video/mkv
title - Free Software Song
duration - 0 s (audio)
format - A_VORBIS
mimetype - audio/mpeg
format version - MPEG-1
resource type - MPEG-1 Layer II audio, 0 kbps (CBR), 0 Hz, dual channel, no copyright, copy
duration - 0m00


mkvinfo output:

+ EBML head
|+ EBML version: 1
|+ EBML read version: 1
|+ EBML maximum ID length: 4
|+ EBML maximum size length: 8
|+ Doc type: matroska
|+ Doc type version: 2
|+ Doc type read version: 2
+ Segment, size 335076
|+ Seek head (subentries will be skipped)
|+ EbmlVoid (size: 4029)
|+ Segment information
| + Timecode scale: 124999
| + Muxing application: libebml v1.2.2 + libmatroska v1.3.0
| + Writing application: mkvmerge v5.1.0 ('And so it goes') built on Nov 29 2011 14:12:21
| + Duration: 108.208s (00:01:48.208)
| + Date: Fri Dec 16 20:55:10 2011 UTC
| + Title: Free Software Song
| + Segment UID: 0x11 0x87 0x07 0xa6 0x57 0x1e 0x40 0x5a 0x47 0x52 0x5d 0x83 0x7c 0x06 0x6f 0xd5 
|+ Segment tracks
| + A track
|  + Track number: 1
|  + Track UID: 3087006734
|  + Track type: audio
|  + Codec ID: A_VORBIS
|  + CodecPrivate, length 2554
|  + Audio track
|+ EbmlVoid (size: 1072)
|+ Cluster


Here are the tags extracted with:

mkvextract tags Free\ Software\ Song\ -\ Richard\ Stallman.mka

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE Tags SYSTEM "matroskatags.dtd">

<Tags>
  <Tag>
    <Targets>
      <TargetType>COLLECTION</TargetType>
      <TargetTypeValue>70</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>TITLE</Name>
      <String>Free Software Song</String>
      <TagLanguage>eng</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>TITLE</Name>
      <String>Free Software Song</String>
      <TagLanguage>fre</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>CONTENT_TYPE</Name>
      <String>music</String>
      <TagLanguage>eng</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>CONTENT_TYPE</Name>
      <String>musique</String>
      <TagLanguage>fre</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>COMPOSER</Name>
      <String>Richard Matthew Stallman</String>
      <TagLanguage>und</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>DISTRIBUTED_BY</Name>
      <String>Free Software Foundation</String>
      <TagLanguage>eng</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>DISTRIBUTED_BY</Name>
      <String>Free Software Foundation</String>
      <TagLanguage>fre</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>URL</Name>
      <String>https://www.gnu.org/music/free-software-song.html</String>
      <TagLanguage>eng</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>URL</Name>
      <String>https://www.gnu.org/music/free-software-song.fr.html</String>
      <TagLanguage>fre</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
  </Tag>
  <Tag>
    <Targets>
      <TargetType>TRACK</TargetType>
      <TargetTypeValue>30</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>TITLE</Name>
      <String>Free Software Song</String>
      <TagLanguage>eng</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>TITLE</Name>
      <String>Free Software Song</String>
      <TagLanguage>fre</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>URL</Name>
      <String>https://www.gnu.org/music/free-software-song.ogg</String>
      <TagLanguage>und</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>ARTIST</Name>
      <String>Richard Matthew Stallman</String>
      <TagLanguage>und</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
      <Simple>
        <Name>URL</Name>
        <String>http://www.stallman.org/</String>
        <TagLanguage>und</TagLanguage>
        <DefaultLanguage>1</DefaultLanguage>
      </Simple>
      <Simple>
        <Name>INSTRUMENTS</Name>
        <String>voice</String>
        <TagLanguage>eng</TagLanguage>
        <DefaultLanguage>1</DefaultLanguage>
      </Simple>
      <Simple>
        <Name>INSTRUMENTS</Name>
        <String>voix</String>
        <TagLanguage>fre</TagLanguage>
        <DefaultLanguage>1</DefaultLanguage>
      </Simple>
    </Simple>
  </Tag>
</Tags>
Tags: No tags attached.
Attached Files

Activities

LRN

2011-12-24 15:03

developer   ~0005189

This is partially because of a libextractor mp3-related bug that was recently fixed. Try an svn version of libextractor.

Also, AFAIR, libextractor has no MKV support (patches are welcome!)

LRN

2011-12-24 15:11

developer   ~0005191

I stand corrected, it does have MKV support. Apparently, it's somewhat broken (patches are welcome!).

LRN

2012-03-20 07:00

developer   ~0005639

With the mp3 false positives fixed and with the latest version of libextractor from SVN (plus one patch on top of it, which is queued for committing), I get:
Keywords for file Free Software Song - Richard Stallman.mka:
mimetype - video/mkv
title - Free Software Song
duration - 0 s (audio)
format - A_VORBIS

That is, erroneous information is not present. Now what remains is to add support for various tags...

Note however, that while there ARE guidelines for element order in Matroska files, they are mostly non-mandatory. Meaning that the data might be in ANY order, and it is not guaranteed that interesting metadata will be located at the beginning or at the end of a file. Ideally, an mkv information extractor should read through the whole file, jumping over uninteresting elements, and reading the interesting ones.
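A minimal sketch of that read-through approach, in Python rather than the C that libextractor is written in. The element IDs are taken from the Matroska specification; the function names (`read_vint`, `walk`, `scan`) and the choice of which elements count as interesting are assumptions made for illustration, and this is not the code of any libextractor plugin. Elements of unknown size and lacing are not handled.

```python
import io

# Element IDs from the Matroska specification (marker bits included).
SEGMENT, INFO = 0x18538067, 0x1549A966
MASTER_IDS = {SEGMENT, INFO}  # containers this sketch descends into
INTERESTING = {0x7BA9: "Title", 0x4D80: "MuxingApp", 0x5741: "WritingApp"}


def read_vint(stream, keep_marker):
    """Read one EBML variable-length integer from the stream."""
    first = stream.read(1)
    if not first:
        raise ValueError("truncated EBML data")
    byte, length, mask = first[0], 1, 0x80
    # The number of leading zero bits gives the total length in bytes.
    while length <= 8 and not byte & mask:
        mask >>= 1
        length += 1
    if length > 8:
        raise ValueError("invalid EBML length marker")
    # IDs keep the marker bit, sizes drop it.
    value = byte if keep_marker else byte & (mask - 1)
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise ValueError("truncated EBML data")
    for b in rest:
        value = (value << 8) | b
    return value


def walk(stream, end, found):
    """Visit every element up to offset `end`, skipping uninteresting ones."""
    while stream.tell() < end:
        elem_id = read_vint(stream, keep_marker=True)
        size = read_vint(stream, keep_marker=False)
        payload_end = stream.tell() + size
        if elem_id in MASTER_IDS:
            walk(stream, payload_end, found)
        elif elem_id in INTERESTING:
            text = stream.read(size).decode("utf-8", "replace")
            found.append((INTERESTING[elem_id], text))
        stream.seek(payload_end)  # jump over anything not consumed above


def scan(data):
    """Return (element name, text) pairs found anywhere in `data`."""
    found = []
    walk(io.BytesIO(data), len(data), found)
    return found
```

Because every element is visited and uninteresting payloads are skipped with a seek, the order of elements in the file does not matter to this walker.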

So, right now we need a map between Matroska elements ( http://matroska.org/technical/specs/index.html ) and Extractor tags ( http://fossies.org/unix/privat/libextractor-0.6.3.tar.gz/dox/extractor_8h.html#ae9bcf4746a2cb06159db2c63ad91bb55 ). But bear in mind that even when mapped correctly, the interesting elements might simply not be at the beginning of the file (which is where the mkv extractor looks for them).
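Such a map could be as simple as a lookup table with a fallback for names that have no counterpart. The sketch below is an assumption about the shape of that table, not libextractor code: the labels on the right are plain illustrative strings rather than the identifiers from extractor.h, and `TAG_MAP` and `to_keyword` are hypothetical names.

```python
# Matroska SimpleTag name -> illustrative keyword label (assumed, not from extractor.h).
TAG_MAP = {
    "TITLE": "title",
    "COMPOSER": "composer",
    "ARTIST": "artist",
    "URL": "URL",
}


def to_keyword(name, value):
    """Map one Matroska tag to a (label, text) pair, keeping unmapped names visible."""
    label = TAG_MAP.get(name)
    if label is None:
        # No counterpart known: report the raw name together with the value.
        return ("unknown", f"{name}={value}")
    return (label, value)


if __name__ == "__main__":
    for tag in [("TITLE", "Free Software Song"), ("CONTENT_TYPE", "music")]:
        print("%s - %s" % to_keyword(*tag))
```

Keeping unmapped names in the output as NAME=value means no tag is silently dropped while the table is still incomplete.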

LRN

2012-03-25 20:19

developer   ~0005656

With the new reading-through-file libextractor architecture and a new EBML (Matroska/WebM) plugin, I get:
Keywords for file Free Software Song - Richard Stallman.mka:
format version - 1
resource type - matroska 2 (EBML 1)
duration - 108s
title - Free Software Song
created by software - Written with mkvmerge v5.1.0 ('And so it goes') built on Nov 29 2011 14:12:21, muxed with libebml v1.2.2 + libmatroska v1.3.0
resource type - audio track (A_VORBIS, 1-channel at 8000Hz) [eng]
title - Free Software Song
title - Free Software Song
unknown - CONTENT_TYPE=music
unknown - CONTENT_TYPE=musique
composer - Richard Matthew Stallman
unknown - DISTRIBUTED_BY=Free Software Foundation
unknown - DISTRIBUTED_BY=Free Software Foundation
URL - https://www.gnu.org/music/free-software-song.html
URL - https://www.gnu.org/music/free-software-song.fr.html
title - Free Software Song
title - Free Software Song
URL - https://www.gnu.org/music/free-software-song.ogg
artist - Richard Matthew Stallman
URL - http://www.stallman.org/
unknown - INSTRUMENTS=voice
unknown - INSTRUMENTS=voix

Christian Grothoff

2012-08-24 23:20

manager   ~0006282

Last edited: 2012-08-24 23:21

SVN HEAD now gives:

Keywords for file test/Free Software Song - Richard Stallman.mka:
mimetype - application/octet-stream
duration - 0:01:48.208009329
mimetype - audio/x-matroska
mimetype - audio/x-vorbis
title - Free Software Song
composer - Richard Matthew Stallman
artist - Richard Matthew Stallman
container format - Matroska
encoder - Xiph.Org libVorbis I 20020717
encoder version - 0
audio codec - Vorbis
audio language - en
channels - 1
sample rate - 8000
audio bitrate - 22400

I think this is quite satisfactory, but obviously has room for improvement.
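For reference, the duration line can be related to the mkvinfo output as follows. Matroska stores Duration as a float counted in units of the timecode scale (in nanoseconds), so the value in nanoseconds is the product of the two. The sketch below assumes the h:mm:ss.nnnnnnnnn layout seen in the output above; `duration_ns` and `format_ns` are hypothetical helper names.

```python
def duration_ns(raw_duration, timecode_scale):
    """Convert a raw Matroska Duration (in timecode-scale units) to nanoseconds."""
    return round(raw_duration * timecode_scale)


def format_ns(ns):
    """Format a nanosecond count as h:mm:ss.nnnnnnnnn."""
    seconds, frac = divmod(ns, 1_000_000_000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{frac:09d}"


if __name__ == "__main__":
    # 108208009329 ns is the duration reported for the sample file.
    print(format_ns(108208009329))
```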

LRN

2012-08-25 02:14

developer   ~0006287

With SVN HEAD + my Matroska patches to gst-plugins-good [1] [2] [3] [4] I get this:

duration - 0:01:48.208009329
mimetype - audio/x-matroska
mimetype - audio/x-vorbis
title - Free Software Song
album - Free Software Song
unknown - CONTENT_TYPE=music
unknown - CONTENT_TYPE=musique
unknown - DISTRIBUTED_BY=Free Software Foundation
unknown - URL=https://www.gnu.org/music/free-software-song.html
unknown - URL=https://www.gnu.org/music/free-software-song.fr.html
unknown - URL=https://www.gnu.org/music/free-software-song.ogg
unknown - ARTIST/INSTRUMENTS=voice
unknown - ARTIST/INSTRUMENTS=voix
composer - Richard Matthew Stallman
artist - Richard Matthew Stallman
URL - http://www.stallman.org/
container format - Matroska
encoder - Xiph.Org libVorbis I 20020717
encoder version - 0
audio codec - Vorbis
audio language - en
channels - 1
sample rate - 8000
audio bitrate - 22400

[1] https://bugzilla.gnome.org/show_bug.cgi?id=682448
[2] https://bugzilla.gnome.org/show_bug.cgi?id=682524
[3] https://bugzilla.gnome.org/show_bug.cgi?id=682615
[4] https://bugzilla.gnome.org/show_bug.cgi?id=682644
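Names such as ARTIST/INSTRUMENTS come from Simple tags nested inside another Simple tag. A minimal sketch of that flattening, run against XML in the form that mkvextract produces, is given below. It uses Python's standard xml.etree.ElementTree and is not the GStreamer implementation; `flatten`, `flatten_tags` and `SAMPLE` are hypothetical names. Note that it prefixes every nested name, including the nested URL, which the output above reports as a plain URL.

```python
import xml.etree.ElementTree as ET

# A cut-down sample in the same structure as the mkvextract output.
SAMPLE = """
<Tags>
  <Tag>
    <Simple>
      <Name>ARTIST</Name>
      <String>Richard Matthew Stallman</String>
      <Simple>
        <Name>URL</Name>
        <String>http://www.stallman.org/</String>
      </Simple>
      <Simple>
        <Name>INSTRUMENTS</Name>
        <String>voice</String>
      </Simple>
    </Simple>
  </Tag>
</Tags>
"""


def flatten(simple, prefix=""):
    """Yield (path, value) for a Simple element and all Simple elements nested in it."""
    name = prefix + simple.findtext("Name", default="")
    value = simple.findtext("String")
    if value is not None:
        yield name, value
    for child in simple.findall("Simple"):
        yield from flatten(child, name + "/")


def flatten_tags(xml_text):
    """Return all (path, value) pairs from a <Tags> document, in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for tag in root.findall("Tag"):
        for simple in tag.findall("Simple"):
            pairs.extend(flatten(simple))
    return pairs


if __name__ == "__main__":
    for path, value in flatten_tags(SAMPLE):
        print(f"{path}={value}")
```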

Issue History

Date Modified | Username | Field | Change
2011-12-24 12:04 | dad | New Issue |
2011-12-24 12:04 | dad | File Added: Free Software Song - Richard Stallman.mka |
2011-12-24 15:03 | LRN | Note Added: 0005189 |
2011-12-24 15:11 | LRN | Note Added: 0005191 |
2012-01-05 22:08 | Christian Grothoff | Target Version | => 1.0.0
2012-01-05 22:08 | Christian Grothoff | Severity | feature => major
2012-01-05 22:08 | Christian Grothoff | Status | new => confirmed
2012-03-20 07:00 | LRN | Note Added: 0005639 |
2012-03-25 20:19 | LRN | Note Added: 0005656 |
2012-08-24 23:18 | Christian Grothoff | Target Version | 1.0.0 =>
2012-08-24 23:20 | Christian Grothoff | Note Added: 0006282 |
2012-08-24 23:21 | Christian Grothoff | Note Edited: 0006282 |
2012-08-24 23:21 | Christian Grothoff | Severity | major => feature
2012-08-25 02:14 | LRN | Note Added: 0006287 |
2012-09-25 17:17 | Christian Grothoff | Status | confirmed => resolved
2012-09-25 17:17 | Christian Grothoff | Fixed in Version | => 1.0.0
2012-09-25 17:17 | Christian Grothoff | Resolution | open => fixed
2012-09-25 17:17 | Christian Grothoff | Assigned To | => Christian Grothoff
2012-09-25 17:18 | Christian Grothoff | Assigned To | Christian Grothoff => LRN
2012-09-25 17:18 | Christian Grothoff | Target Version | => 1.0.0
2012-09-25 17:18 | Christian Grothoff | Status | resolved => closed