This article is a modest attempt to understand the data written on a DVD. Of course, it stays in line with copyright rules.
A quick look at a DVD with Windows Explorer reveals a subdirectory named video_ts containing files with *.IFO, *.BUP, and *.VOB extensions.
- The IFO files hold the data needed for navigation, such as:
  - The chapter structure
  - The audio streams
- The BUP files are backups of the IFO files.
- The VOB files hold the video and audio data, in their encrypted form.
The video_ts.ifo and video_ts.vob files are the first items played when the DVD is inserted into a player. Usually, they display the copyright warning and the language selection menu.
The other files (vts_01_0.ifo, vts_01_0.vob, and so on) hold the film itself and the bonus features.
The software presented here reads the data contained in the IFO files and reconstructs the whole film structure with its chapters, angles, and cells.
2. IFO File Format
On a DVD, the IFO files, like the other DVD files, are made up of 2048-byte records. Two IFO file formats exist, identified by the first 12 bytes of the file:
- "DVDVIDEO-VMG" for the video_ts.ifo file
- "DVDVIDEO-VTS" for the other vts_*.ifo files
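As an illustration of the identification step above, here is a minimal sketch that classifies an IFO file from its first 12 bytes. The enum IfoKind and the function identifyIfo are hypothetical names chosen for this example; they are not taken from the CIFO class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical enum for this sketch; the CIFO class may model this differently.
enum class IfoKind { VideoManager, VideoTitleSet, Unknown };

// Classifies an IFO file from its leading identifier.
// data points to the start of the file, size is the number of bytes available.
IfoKind identifyIfo(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr);
    constexpr std::size_t kIdLength = 12;  // both identifiers are 12 bytes long

    if (size < kIdLength)
        return IfoKind::Unknown;           // too short to hold an identifier
    if (std::memcmp(data, "DVDVIDEO-VMG", kIdLength) == 0)
        return IfoKind::VideoManager;      // video_ts.ifo
    if (std::memcmp(data, "DVDVIDEO-VTS", kIdLength) == 0)
        return IfoKind::VideoTitleSet;     // vts_*.ifo
    return IfoKind::Unknown;
}
```

Checking the identifier before parsing anything else lets a reader reject a file that is not an IFO file before it interprets any of the tables.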
The first record holds general information, plus an offset table that gives the position in the file of the first record of each of the other tables. For reference, the detailed structure of the file is given in an appendix (IFOrecord.zip). In the program, the CIFO class (source files: IFO.h and IFO.cpp) handles IFO file reading. Data extraction relies on two subroutines:
DWORD get4Bytes(PBYTE pBuffer);
WORD get2Bytes(PBYTE pBuffer);
This technique solves the byte-order mismatch between the file format (big endian) and the x86 machine representation (little endian). Only a small part of the data, the part that is displayed, is extracted from the file.
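The two subroutines could be written along the following lines. This is a sketch, not the code from IFO.cpp: the type aliases are portable stand-ins for the Windows types so the example compiles anywhere, and recordToByteOffset is a hypothetical helper that assumes the offset table stores record numbers rather than byte positions.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for the Windows types used in the prototypes (assumption).
using BYTE  = std::uint8_t;
using WORD  = std::uint16_t;
using DWORD = std::uint32_t;
using PBYTE = BYTE*;

// Reads a 16-bit big-endian value starting at pBuffer.
WORD get2Bytes(PBYTE pBuffer)
{
    assert(pBuffer != nullptr);
    return static_cast<WORD>((pBuffer[0] << 8) | pBuffer[1]);
}

// Reads a 32-bit big-endian value starting at pBuffer.
DWORD get4Bytes(PBYTE pBuffer)
{
    assert(pBuffer != nullptr);
    return (static_cast<DWORD>(pBuffer[0]) << 24) |
           (static_cast<DWORD>(pBuffer[1]) << 16) |
           (static_cast<DWORD>(pBuffer[2]) << 8)  |
            static_cast<DWORD>(pBuffer[3]);
}

// Hypothetical helper: converts a record number to a byte offset in the file,
// using the 2048-byte record size.
constexpr std::uint64_t kRecordSize = 2048;

std::uint64_t recordToByteOffset(DWORD recordNumber)
{
    return recordNumber * kRecordSize;  // widened to 64 bits before multiplying
}
```

Assembling the value byte by byte with shifts gives the same result on any host, so no explicit byte swap or check of the machine's endianness is needed.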
3. General Code Layout
The data are displayed in a standard MFC CDialog application, generated with the Visual Studio Wizard (source files DVDExplorer.h, DVDExplorer.cpp, DVDExplorerDlg.h, DVDExplorerDlg.cpp, DVDExplorer.rc, Resource.h). The presentation of the results is laid out with the Visual resource editor. It consists mainly of a TREE_CONTROL and a LIST_CONTROL (CTreeCtrl and CListCtrl).
The CMovie, CTitle, CChapter, CCell, CAngle, CUnit, CAdressMap, CTimeMapTable, and CUnitManager classes provide the internal representation of the data read (source files: Movie.h, Movie.cpp, ...). The CDVD class (DVD.h, DVD.cpp) organizes these data. This class also provides the display interface with the CDVDExpDlg (DVDExplorerDlg.h/cpp) class.
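To give an idea of how such a representation can nest, here is a deliberately simplified sketch of a movie, title, chapter, and cell hierarchy. The struct names, members, and the two helper functions are all hypothetical and do not reproduce the CMovie, CTitle, CChapter, or CCell classes; angles, units, and the map tables are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of the film structure.
struct Cell    { int id; };
struct Chapter { std::vector<Cell> cells; };
struct Title   { std::vector<Chapter> chapters; };
struct Movie   { std::vector<Title> titles; };

// Returns a chapter by its 1-based number, as chapters are usually shown to users.
const Chapter& chapterAt(const Title& title, std::size_t number)
{
    assert(number >= 1 && number <= title.chapters.size());
    return title.chapters[number - 1];
}

// Counts every cell in every chapter of every title.
std::size_t totalCells(const Movie& movie)
{
    std::size_t count = 0;
    for (const Title& title : movie.titles)
        for (const Chapter& chapter : title.chapters)
            count += chapter.cells.size();
    return count;
}
```

A tree of this shape maps naturally onto a CTreeCtrl, with one tree item per title, chapter, and cell.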
Splitting the data into streams is handled by the CStream, CAudio, CSubPicture, CAudioStream, CSubPictureStream, and CStreamProcessing classes (Stream.h, Stream.cpp, ...).
The CDirPickDlg class (DirPickDlg.h, DirPickDlg.cpp) manages the selection of the film folder to read. It works together with the CDirTreeDlg class, derived from CTreeCtrl (DirTreeDlg.h, DirTreeDlg.cpp).
Tracing in debug mode and a dedicated error display highlight errors in the record format. The CErrorDlg class, derived from CDialog, provides the error display (DirPickDlg.h, DirPickDlg.cpp).