View Issue Details

ID: 0004158
Project: libextractor
Category: libextractor main library
View Status: public
Last Update: 2016-01-27 12:47
Reporter: budm
Assigned To: Christian Grothoff
Priority: normal
Severity: minor
Reproducibility: always
Status: feedback
Resolution: open
Platform: Linux
OS: Ubuntu
OS Version: 14.04.3 LTS
Product Version: 1.3
Target Version:
Fixed in Version:
Summary: 0004158: Fails to extract metadata at all depending on plugin initialization sequence
Description: Trying to speed up my code, I used the form:

 EXTRACTOR_plugin_add_defaults();
 for (a few thousand files)
   EXTRACTOR_extract();
 EXTRACTOR_plugin_remove_all();

This sequence seems to fail to read metadata from some files (a 3gp file in this case). The SIGPIPE handler is set to SIG_IGN for this program.

Works reliably if the form is instead:

 for (a few thousand files) {
   EXTRACTOR_plugin_add_defaults();
   EXTRACTOR_extract();
   EXTRACTOR_plugin_remove_all();
 }
Steps To Reproduce: Use the first form given in the description and perform repeated runs on a set of files. The easiest way to debug this is to set a breakpoint on a metadata handler that only "fires" for one file out of all files processed.
Additional Information: Failing file is:

duration - 0:00:02.856000000
mimetype - application/x-3gp
unknown - profile=basic
mimetype - audio/mpeg
unknown - mpegversion=4
unknown - framed=true
unknown - stream-format=raw
unknown - level=2
unknown - base-profile=lc
unknown - profile=lc
creation time - 1947-11-28T20:07:40+0100
container format - ISO MP4/M4A
audio codec - MPEG-4 AAC audio
audio language - en
channels - 2
sample rate - 44100
audio depth - 32
audio bitrate - 96000
maximum audio bitrate - 96000
mimetype - video/x-h264
unknown - stream-format=avc
unknown - alignment=au
unknown - level=3.1
unknown - profile=constrained-baseline
creation time - 1947-11-28T20:07:40+0100
container format - ISO MP4/M4A
video codec - H.264 / AVC
video language - en
video dimensions - 544x960
frame rate - 5000/251
pixel aspect ratio - 1/1
video bitrate - 4220708
Tags: No tags attached.

Activities

Christian Grothoff

2016-01-27 12:47

manager   ~0010107

It is likely that this is a bug in one (or more) of the libraries that we use: if, say, gstreamer has an internal hiccup that persists in global state after running on some file, this hiccup will persist until you unload and reload the plugin. libextractor runs plugins in separate processes, so if you unload/reload it's like you're restarting gstreamer from scratch.

So I'm pretty sure that what you see here is a problem in some library that libextractor uses, not in libextractor itself. You could try loading only one of the plugins to figure out which one is problematic. Then, you can report that to the respective dev team, or at least optimize the performance of your code by only reloading that plugin instead of all plugins.
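
The one-plugin-at-a-time check suggested above could be organized roughly as follows. This is only a sketch of the comparison logic, not libextractor code: probe_fn, find_suspects, demo_probe and the plugin names "plugin_a" to "plugin_c" are hypothetical names made up for illustration. The probe is assumed to report how many metadata items came back for the failing file, once with the plugin kept loaded across all files and once with the plugin reloaded around each file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback (not part of libextractor): returns how many
   metadata items were reported for the failing file when only `plugin`
   is loaded.  reuse != 0: plugin loaded once and kept for all files
   (the failing form).  reuse == 0: plugin loaded and unloaded around
   every file (the working form). */
typedef int (*probe_fn) (const char *plugin, int reuse, void *cls);

/* Stores in `suspects` every plugin whose result depends on the loading
   pattern and returns how many were stored.  `suspects` must have room
   for `n` entries. */
size_t
find_suspects (const char *const *plugins, size_t n,
               probe_fn probe, void *cls,
               const char **suspects)
{
  size_t found = 0;
  size_t i;

  assert (NULL != probe);
  for (i = 0; i < n; i++)
  {
    int kept = probe (plugins[i], 1, cls);
    int fresh = probe (plugins[i], 0, cls);

    if (kept != fresh)
      suspects[found++] = plugins[i];
  }
  return found;
}

/* Stand-in probe for illustration only: it pretends that "plugin_b"
   returns nothing once it has been kept loaded.  This is made-up data,
   not a measurement. */
int
demo_probe (const char *plugin, int reuse, void *cls)
{
  (void) cls;
  if ( (0 != reuse) && (0 == strcmp (plugin, "plugin_b")) )
    return 0;
  return 5;
}
```

A real probe would presumably load a single plugin with EXTRACTOR_plugin_add (), run EXTRACTOR_extract () with a callback that counts items for the failing file, and finish with EXTRACTOR_plugin_remove_all (). Check the exact arguments against extractor.h for your version. Any plugin that ends up in the suspects list is the one to report upstream, or to reload on its own as a workaround.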

Issue History

Date Modified     Username            Field                 Change
2016-01-26 15:33  budm                New Issue
2016-01-27 12:47  Christian Grothoff  Note Added: 0010107
2016-01-27 12:47  Christian Grothoff  Assigned To           => Christian Grothoff
2016-01-27 12:47  Christian Grothoff  Status                new => feedback