org.eclipse.jgit.transport
Class PackParser

java.lang.Object
  extended by org.eclipse.jgit.transport.PackParser
Direct Known Subclasses:
DfsPackParser, ObjectDirectoryPackParser

public abstract class PackParser
extends Object

Parses a pack stream and imports it for an ObjectInserter.

Applications can acquire an instance of a parser from ObjectInserter's ObjectInserter.newPackParser(InputStream) method.

Implementations of ObjectInserter should subclass this type and provide their own logic for the various on*() event methods declared to be abstract.


Nested Class Summary
static class PackParser.ObjectTypeAndSize
          Type and size information about an object in the database buffer.
static class PackParser.Source
          Location the data is being obtained from.
static class PackParser.UnresolvedDelta
          Information about an unresolved delta in this pack stream.
 
Constructor Summary
protected PackParser(ObjectDatabase odb, InputStream src)
          Initialize a pack parser.
 
Method Summary
protected  byte[] buffer()
           
protected abstract  boolean checkCRC(int oldCRC)
          Check the current CRC matches the expected value.
 ObjectIdSubclassMap<ObjectId> getBaseObjectIds()
           
 String getLockMessage()
           
 ObjectIdSubclassMap<ObjectId> getNewObjectIds()
           
 PackedObjectInfo getObject(int nth)
          Get the information about the requested object.
 int getObjectCount()
          Get the number of objects in the stream.
 List<PackedObjectInfo> getSortedObjectList(Comparator<PackedObjectInfo> cmp)
          Get all of the objects, sorted by their name.
 boolean isAllowThin()
           
 boolean isCheckEofAfterPackFooter()
           
 boolean isExpectDataAfterPackFooter()
           
protected  PackedObjectInfo newInfo(AnyObjectId id, PackParser.UnresolvedDelta delta, ObjectId deltaBase)
          Construct a PackedObjectInfo instance for this parser.
protected abstract  boolean onAppendBase(int typeCode, byte[] data, PackedObjectInfo info)
          Provide the implementation with a base that was outside of the pack.
protected abstract  void onBeginOfsDelta(long deltaStreamPosition, long baseStreamPosition, long inflatedSize)
          Event notifying start of a delta referencing its base by offset.
protected abstract  void onBeginRefDelta(long deltaStreamPosition, AnyObjectId baseId, long inflatedSize)
          Event notifying start of a delta referencing its base by ObjectId.
protected abstract  void onBeginWholeObject(long streamPosition, int type, long inflatedSize)
          Event notifying the start of an object stored whole (not as a delta).
protected  PackParser.UnresolvedDelta onEndDelta()
          Event notifying the end of the current delta.
protected abstract  void onEndThinPack()
          Event indicating a thin pack has been completely processed.
protected abstract  void onEndWholeObject(PackedObjectInfo info)
          Event notifying the end of the current object.
protected abstract  void onInflatedObjectData(PackedObjectInfo obj, int typeCode, byte[] data)
          Invoked for commits, trees, tags, and small blobs.
protected abstract  void onObjectData(PackParser.Source src, byte[] raw, int pos, int len)
          Store (and/or checksum) a portion of an object's data.
protected abstract  void onObjectHeader(PackParser.Source src, byte[] raw, int pos, int len)
          Store (and/or checksum) an object header.
protected abstract  void onPackFooter(byte[] hash)
          Provide the implementation with the original stream's pack footer.
protected abstract  void onPackHeader(long objCnt)
          Provide the implementation with the original stream's pack header.
protected abstract  void onStoreStream(byte[] raw, int pos, int len)
          Store bytes received from the raw stream.
 PackLock parse(ProgressMonitor progress)
          Parse the pack stream.
 PackLock parse(ProgressMonitor receiving, ProgressMonitor resolving)
          Parse the pack stream.
protected abstract  int readDatabase(byte[] dst, int pos, int cnt)
          Read from the database's current position into the buffer.
protected  PackParser.ObjectTypeAndSize readObjectHeader(PackParser.ObjectTypeAndSize info)
          Read the header of the current object.
protected abstract  PackParser.ObjectTypeAndSize seekDatabase(PackedObjectInfo obj, PackParser.ObjectTypeAndSize info)
          Reposition the database to re-read a previously stored object.
protected abstract  PackParser.ObjectTypeAndSize seekDatabase(PackParser.UnresolvedDelta delta, PackParser.ObjectTypeAndSize info)
          Reposition the database to re-read a previously stored object.
 void setAllowThin(boolean allow)
          Configure this index pack instance to allow a thin pack.
 void setCheckEofAfterPackFooter(boolean b)
          Ensure EOF is read from the input stream after the footer.
 void setExpectDataAfterPackFooter(boolean e)
           
 void setLockMessage(String msg)
          Set the lock message for the incoming pack data.
 void setMaxObjectSizeLimit(long limit)
          Set the maximum allowed Git object size.
 void setNeedBaseObjectIds(boolean b)
          Configure this index pack instance to keep track of the objects assumed for delta bases.
 void setNeedNewObjectIds(boolean b)
          Configure this index pack instance to keep track of new objects.
 void setObjectChecker(ObjectChecker oc)
          Configure the checker used to validate received objects.
 void setObjectChecking(boolean on)
          Configure the checker used to validate received objects.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PackParser

protected PackParser(ObjectDatabase odb,
                     InputStream src)
Initialize a pack parser.

Parameters:
odb - database the parser will write its objects into.
src - the stream the parser will read.
Method Detail

isAllowThin

public boolean isAllowThin()
Returns:
true if a thin pack (missing base objects) is permitted.

setAllowThin

public void setAllowThin(boolean allow)
Configure this index pack instance to allow a thin pack.

Thin packs are sometimes used during network transfers to allow a delta to be sent without a base object. Such packs are not permitted on disk.

Parameters:
allow - true to enable a thin pack.

setNeedNewObjectIds

public void setNeedNewObjectIds(boolean b)
Configure this index pack instance to keep track of new objects.

By default an index pack doesn't save the list of new objects that it created. Setting this flag to true allows the caller to use getNewObjectIds() to retrieve that list.

Parameters:
b - true to enable keeping track of new objects.

setNeedBaseObjectIds

public void setNeedBaseObjectIds(boolean b)
Configure this index pack instance to keep track of the objects assumed for delta bases.

By default an index pack doesn't save the objects that were used as delta bases. Setting this flag to true will allow the caller to use getBaseObjectIds() to retrieve that list.

Parameters:
b - true to enable keeping track of delta bases.

isCheckEofAfterPackFooter

public boolean isCheckEofAfterPackFooter()
Returns:
true if the EOF should be read from the input after the footer.

setCheckEofAfterPackFooter

public void setCheckEofAfterPackFooter(boolean b)
Ensure EOF is read from the input stream after the footer.

Parameters:
b - true if the EOF should be read; false if it is not checked.

isExpectDataAfterPackFooter

public boolean isExpectDataAfterPackFooter()
Returns:
true if there is data expected after the pack footer.

setExpectDataAfterPackFooter

public void setExpectDataAfterPackFooter(boolean e)
Parameters:
e - true if there is additional data in InputStream after pack. This requires the InputStream to support the mark and reset functions.
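
The mark and reset requirement presumably exists because a parser that reads in blocks can pull in bytes that belong to whatever follows the pack, and needs a way to hand them back. The following is a minimal sketch of that idea using only java.io; MarkResetSketch and its methods are hypothetical names, not part of JGit, and `wanted` stands in for the pack length a real parser would learn while parsing.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

final class MarkResetSketch {
    /** Wraps the stream only when it lacks mark/reset support. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /**
     * Reads up to blockSize bytes speculatively, then rewinds so that only
     * the first `wanted` bytes count as consumed. Anything after them stays
     * available to the next reader of the stream.
     */
    static byte[] readThenRewind(InputStream in, int blockSize, int wanted)
            throws IOException {
        in.mark(blockSize);                       // remember the current position
        byte[] block = in.readNBytes(blockSize);  // may read past the wanted data
        in.reset();                               // return to the marked position
        return in.readNBytes(Math.min(wanted, block.length));
    }
}
```

A caller whose stream does not support mark and reset could wrap it with ensureMarkSupported before handing it to code that has this requirement.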

getNewObjectIds

public ObjectIdSubclassMap<ObjectId> getNewObjectIds()
Returns:
the new objects that were sent by the user

getBaseObjectIds

public ObjectIdSubclassMap<ObjectId> getBaseObjectIds()
Returns:
set of objects the incoming pack assumed for delta purposes

setObjectChecker

public void setObjectChecker(ObjectChecker oc)
Configure the checker used to validate received objects.

Usually object checking isn't necessary, as Git implementations only create valid objects in pack files. However, additional checking may be useful if processing data from an untrusted source.

Parameters:
oc - the checker instance; null to disable object checking.

setObjectChecking

public void setObjectChecking(boolean on)
Configure the checker used to validate received objects.

Usually object checking isn't necessary, as Git implementations only create valid objects in pack files. However, additional checking may be useful if processing data from an untrusted source.

This is shorthand for:

 setObjectChecker(on ? new ObjectChecker() : null);
 

Parameters:
on - true to enable the default checker; false to disable it.

getLockMessage

public String getLockMessage()
Returns:
the message to record with the pack lock.

setLockMessage

public void setLockMessage(String msg)
Set the lock message for the incoming pack data.

Parameters:
msg - if not null, the message to associate with the incoming data while it is locked to prevent garbage collection.

setMaxObjectSizeLimit

public void setMaxObjectSizeLimit(long limit)
Set the maximum allowed Git object size.

If an object is larger than the given size, parsing is aborted with an exception.

Parameters:
limit - the Git object size limit. If zero then there is no limit.
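
As an illustration of the contract described here, the check might look like the sketch below. SizeLimitSketch is a hypothetical helper, and the use of a plain IOException is an assumption; the exact exception type thrown by the parser is not stated in this page.

```java
import java.io.IOException;

final class SizeLimitSketch {
    /**
     * Rejects objects larger than the limit. A limit of zero disables the
     * check, and an object exactly at the limit is still accepted.
     */
    static void check(long limit, long inflatedSize) throws IOException {
        if (limit > 0 && inflatedSize > limit) {
            throw new IOException("object of " + inflatedSize
                    + " bytes exceeds limit of " + limit + " bytes");
        }
    }
}
```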

getObjectCount

public int getObjectCount()
Get the number of objects in the stream.

The object count is only available after parse(ProgressMonitor) has returned. The count may have been increased if the stream was a thin pack, and missing base objects were appended onto it by the subclass.

Returns:
number of objects parsed out of the stream.

getObject

public PackedObjectInfo getObject(int nth)
Get the information about the requested object.

The object information is only available after parse(ProgressMonitor) has returned.

Parameters:
nth - index of the object in the stream. Must be between 0 and getObjectCount()-1.
Returns:
the object information.

getSortedObjectList

public List<PackedObjectInfo> getSortedObjectList(Comparator<PackedObjectInfo> cmp)
Get all of the objects, sorted by their name.

The object information is only available after parse(ProgressMonitor) has returned.

To maintain lower memory usage and good runtime performance, this method sorts the objects in-place and therefore impacts the ordering presented by getObject(int).

Parameters:
cmp - comparison function; if null, objects are sorted by ObjectId.
Returns:
sorted list of objects in this pack stream.
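
The side effect of in-place sorting can be sketched with a small stand-in class. InPlaceSortSketch is hypothetical and stores plain strings rather than PackedObjectInfo instances; it only shows why indexed access changes after a sort and how a null comparator falls back to natural ordering.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class InPlaceSortSketch {
    private final String[] entries;

    InPlaceSortSketch(String... names) {
        entries = names.clone();
    }

    /** Indexed access, analogous in spirit to getObject(int). */
    String get(int nth) {
        return entries[nth];
    }

    /**
     * Sorts the backing array itself instead of a copy, so later calls to
     * get(int) observe the new order. A null comparator selects the
     * elements' natural ordering.
     */
    List<String> sorted(Comparator<String> cmp) {
        Arrays.sort(entries, cmp);
        return Arrays.asList(entries); // a view over the same array, not a copy
    }
}
```

Callers that depend on stream order should therefore read entries by index before requesting a sorted list.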

parse

public final PackLock parse(ProgressMonitor progress)
                     throws IOException
Parse the pack stream.

Parameters:
progress - callback to provide progress feedback during parsing. If null, NullProgressMonitor will be used.
Returns:
the pack lock, if one was requested by setting setLockMessage(String).
Throws:
IOException - the stream is malformed, or contains corrupt objects.

parse

public PackLock parse(ProgressMonitor receiving,
                      ProgressMonitor resolving)
               throws IOException
Parse the pack stream.

Parameters:
receiving - receives progress feedback during the initial receiving objects phase. If null, NullProgressMonitor will be used.
resolving - receives progress feedback during the resolving objects phase.
Returns:
the pack lock, if one was requested by setting setLockMessage(String).
Throws:
IOException - the stream is malformed, or contains corrupt objects.

readObjectHeader

protected PackParser.ObjectTypeAndSize readObjectHeader(PackParser.ObjectTypeAndSize info)
                                                 throws IOException
Read the header of the current object.

After the header has been parsed, this method automatically invokes onObjectHeader(Source, byte[], int, int) to allow the implementation to update its internal checksums for the bytes read.

When this method returns the database will be positioned on the first byte of the deflated data stream.

Parameters:
info - the info object to populate.
Returns:
info, after populating.
Throws:
IOException - the size cannot be read.
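
For orientation, the type-and-size header defined by Git's pack format can be decoded as in the sketch below. This is not JGit's implementation; ObjectHeaderSketch and its fields are hypothetical names, and the sketch covers only the variable-length type and size field, not the base reference that follows it for delta objects.

```java
final class ObjectHeaderSketch {
    final int type;         // 1 = commit, 2 = tree, 3 = blob, 4 = tag, 6 = ofs-delta, 7 = ref-delta
    final long size;        // inflated size of the object
    final int headerLength; // number of bytes the header occupied

    private ObjectHeaderSketch(int type, long size, int headerLength) {
        this.type = type;
        this.size = size;
        this.headerLength = headerLength;
    }

    /**
     * Decodes the header starting at buf[pos]. The first byte carries a
     * continuation flag (bit 7), the type (bits 6-4) and the low 4 bits of
     * the size. Each following byte contributes 7 more size bits, least
     * significant group first, while its own bit 7 is set.
     */
    static ObjectHeaderSketch decode(byte[] buf, int pos) {
        int p = pos;
        int c = buf[p++] & 0xff;
        int type = (c >> 4) & 7;
        long size = c & 15;
        int shift = 4;
        while ((c & 0x80) != 0) {
            c = buf[p++] & 0xff;
            size += ((long) (c & 0x7f)) << shift;
            shift += 7;
        }
        return new ObjectHeaderSketch(type, size, p - pos);
    }
}
```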

buffer

protected byte[] buffer()
Returns:
a temporary byte array for use by the caller.

newInfo

protected PackedObjectInfo newInfo(AnyObjectId id,
                                   PackParser.UnresolvedDelta delta,
                                   ObjectId deltaBase)
Construct a PackedObjectInfo instance for this parser.

Parameters:
id - identity of the object to be tracked.
delta - if the object was previously an unresolved delta, this is the delta object that was tracking it. Otherwise null.
deltaBase - if the object was previously an unresolved delta, this is the ObjectId of the base of the delta. The base may be outside of the pack stream if the stream was a thin-pack.
Returns:
info object containing this object's data.

onStoreStream

protected abstract void onStoreStream(byte[] raw,
                                      int pos,
                                      int len)
                               throws IOException
Store bytes received from the raw stream.

This method is invoked during parse(ProgressMonitor) as data is consumed from the incoming stream. Implementors may use this event to archive the raw incoming stream to the destination repository in large chunks, without paying attention to object boundaries.

The only component of the pack not supplied to this method is the last 20 bytes of the pack that comprise the trailing SHA-1 checksum. Those are passed to onPackFooter(byte[]).

Parameters:
raw - buffer to copy data out of.
pos - first offset within the buffer that is valid.
len - number of bytes in the buffer that are valid.
Throws:
IOException - the stream cannot be archived.

onObjectHeader

protected abstract void onObjectHeader(PackParser.Source src,
                                       byte[] raw,
                                       int pos,
                                       int len)
                                throws IOException
Store (and/or checksum) an object header.

Invoked after any of the onBegin() events. The entire header is supplied in a single invocation, before any object data is supplied.

Parameters:
src - where the data came from
raw - buffer to read data from.
pos - first offset within buffer that is valid.
len - number of bytes in buffer that are valid.
Throws:
IOException - the stream cannot be archived.

onObjectData

protected abstract void onObjectData(PackParser.Source src,
                                     byte[] raw,
                                     int pos,
                                     int len)
                              throws IOException
Store (and/or checksum) a portion of an object's data.

This method may be invoked multiple times per object, depending on the size of the object, the size of the parser's internal read buffer, and the alignment of the object relative to the read buffer.

Invoked after onObjectHeader(Source, byte[], int, int).

Parameters:
src - where the data came from
raw - buffer to read data from.
pos - first offset within buffer that is valid.
len - number of bytes in buffer that are valid.
Throws:
IOException - the stream cannot be archived.

onInflatedObjectData

protected abstract void onInflatedObjectData(PackedObjectInfo obj,
                                             int typeCode,
                                             byte[] data)
                                      throws IOException
Invoked for commits, trees, tags, and small blobs.

Parameters:
obj - the object info, populated.
typeCode - the type of the object.
data - inflated data for the object.
Throws:
IOException - the object cannot be archived.

onPackHeader

protected abstract void onPackHeader(long objCnt)
                              throws IOException
Provide the implementation with the original stream's pack header.

Parameters:
objCnt - number of objects expected in the stream.
Throws:
IOException - the implementation refuses to work with this many objects.
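
The object count comes from the 12-byte header that opens every Git pack: the signature "PACK", a 4-byte version, and a 4-byte object count, all in network byte order. The sketch below reads that header with java.nio; PackHeaderSketch is a hypothetical helper, not JGit code. Treating the count as unsigned is one plausible reason the event takes a long.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

final class PackHeaderSketch {
    /** Returns the object count from a 12-byte pack header. */
    static long readObjectCount(byte[] hdr) throws IOException {
        if (hdr.length < 12) {
            throw new IOException("pack header is shorter than 12 bytes");
        }
        ByteBuffer b = ByteBuffer.wrap(hdr); // big-endian by default
        if (b.get() != 'P' || b.get() != 'A' || b.get() != 'C' || b.get() != 'K') {
            throw new IOException("missing PACK signature");
        }
        int version = b.getInt();
        if (version != 2 && version != 3) {
            throw new IOException("unsupported pack version " + version);
        }
        return b.getInt() & 0xffffffffL; // interpret the count as unsigned
    }
}
```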

onPackFooter

protected abstract void onPackFooter(byte[] hash)
                              throws IOException
Provide the implementation with the original stream's pack footer.

Parameters:
hash - the trailing 20 bytes of the pack; this is a SHA-1 checksum of all of the preceding pack data.
Throws:
IOException - the stream cannot be archived.
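
A sketch of how the footer relates to the rest of the pack, using java.security.MessageDigest: the last 20 bytes should equal the SHA-1 of every byte before them. PackFooterSketch is a hypothetical helper that works on a fully buffered pack for brevity, whereas a real parser would update the digest incrementally as data streams in.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class PackFooterSketch {
    static final int FOOTER_LEN = 20; // length of a SHA-1 digest

    private static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is required on every JVM
        }
    }

    /** Appends the SHA-1 of body to body, as a pack writer would. */
    static byte[] withFooter(byte[] body) {
        byte[] out = Arrays.copyOf(body, body.length + FOOTER_LEN);
        System.arraycopy(sha1().digest(body), 0, out, body.length, FOOTER_LEN);
        return out;
    }

    /** Returns true if the trailing 20 bytes are the SHA-1 of the rest. */
    static boolean footerMatches(byte[] pack) {
        if (pack.length < FOOTER_LEN) {
            return false;
        }
        int bodyLen = pack.length - FOOTER_LEN;
        MessageDigest md = sha1();
        md.update(pack, 0, bodyLen);
        byte[] footer = Arrays.copyOfRange(pack, bodyLen, pack.length);
        return MessageDigest.isEqual(md.digest(), footer);
    }
}
```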

onAppendBase

protected abstract boolean onAppendBase(int typeCode,
                                        byte[] data,
                                        PackedObjectInfo info)
                                 throws IOException
Provide the implementation with a base that was outside of the pack.

This event only occurs on a thin pack for base objects that were outside of the pack and came from the local repository. Usually an implementation uses this event to compress the base and append it onto the end of the pack, so the pack stays self-contained.

Parameters:
typeCode - type of the base object.
data - complete content of the base object.
info - packed object information for this base. Implementors must populate the CRC and offset members if returning true.
Returns:
true if the info should be included in the object list returned by getSortedObjectList(Comparator), false if it should not be included.
Throws:
IOException - the base could not be included into the pack.

onEndThinPack

protected abstract void onEndThinPack()
                               throws IOException
Event indicating a thin pack has been completely processed.

This event is invoked only if a thin pack has delta references to objects external from the pack. The event is called after all of those deltas have been resolved.

Throws:
IOException - the pack cannot be archived.

seekDatabase

protected abstract PackParser.ObjectTypeAndSize seekDatabase(PackedObjectInfo obj,
                                                             PackParser.ObjectTypeAndSize info)
                                                      throws IOException
Reposition the database to re-read a previously stored object.

If the database is computing CRC-32 checksums for object data, it should reset its internal CRC instance during this method call.

Parameters:
obj - the object position to begin reading from. This is from newInfo(AnyObjectId, UnresolvedDelta, ObjectId).
info - object to populate with type and size.
Returns:
the info object.
Throws:
IOException - the database cannot reposition to this location.

seekDatabase

protected abstract PackParser.ObjectTypeAndSize seekDatabase(PackParser.UnresolvedDelta delta,
                                                             PackParser.ObjectTypeAndSize info)
                                                      throws IOException
Reposition the database to re-read a previously stored object.

If the database is computing CRC-32 checksums for object data, it should reset its internal CRC instance during this method call.

Parameters:
delta - the object position to begin reading from. This is an instance previously returned by onEndDelta().
info - object to populate with type and size.
Returns:
the info object.
Throws:
IOException - the database cannot reposition to this location.

readDatabase

protected abstract int readDatabase(byte[] dst,
                                    int pos,
                                    int cnt)
                             throws IOException
Read from the database's current position into the buffer.

Parameters:
dst - the buffer to copy read data into.
pos - position within dst to start copying data into.
cnt - ideal target number of bytes to read. Actual read length may be shorter.
Returns:
number of bytes stored.
Throws:
IOException - the database cannot be accessed.
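
Because the actual read length may be shorter than cnt, code that needs an exact number of bytes has to loop. The sketch below shows such a loop against a hypothetical ShortReader interface that mirrors this method's signature. Treating a return value of zero or less as end of data is an assumption, since the return value at end of data is not specified here.

```java
import java.io.EOFException;
import java.io.IOException;

final class ReadFullySketch {
    /** Hypothetical stand-in for a source whose reads may be short. */
    interface ShortReader {
        int read(byte[] dst, int pos, int cnt) throws IOException;
    }

    /** Keeps reading until exactly cnt bytes have been stored in dst. */
    static void readFully(ShortReader src, byte[] dst, int pos, int cnt)
            throws IOException {
        while (cnt > 0) {
            int n = src.read(dst, pos, cnt);
            if (n <= 0) { // assumption: no progress means the data ended early
                throw new EOFException("unexpected end of data");
            }
            pos += n;
            cnt -= n;
        }
    }
}
```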

checkCRC

protected abstract boolean checkCRC(int oldCRC)
Check the current CRC matches the expected value.

This method is invoked when an object is read back in from the database and its data is used during delta resolution. The CRC is validated after the object has been fully read, allowing the parser to verify there was no silent data corruption.

Implementations are free to ignore this check by always returning true if they are performing other data integrity validations at a lower level.

Parameters:
oldCRC - the prior CRC that was recorded during the first scan of the object from the pack stream.
Returns:
true if the CRC matches; false if it does not.
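
One way an implementation might track the checksum is sketched below with java.util.zip.CRC32. CrcCheckSketch is a hypothetical class: update would be fed from the object header and data events, reset corresponds to the reset described for seekDatabase, and check plays the role of this method.

```java
import java.util.zip.CRC32;

final class CrcCheckSketch {
    private final CRC32 crc = new CRC32();

    /** Starts a fresh checksum, e.g. when repositioning to re-read an object. */
    void reset() {
        crc.reset();
    }

    /** Folds one chunk of header or object data into the running checksum. */
    void update(byte[] raw, int pos, int len) {
        crc.update(raw, pos, len);
    }

    /** The current CRC-32 as an int, suitable for recording with the object. */
    int value() {
        return (int) crc.getValue();
    }

    /** Compares the checksum of the re-read bytes with the recorded one. */
    boolean check(int oldCRC) {
        return value() == oldCRC;
    }
}
```

Feeding the data in several chunks yields the same value as a single update, so the chunk boundaries of the first scan do not have to match those of the re-read.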

onBeginWholeObject

protected abstract void onBeginWholeObject(long streamPosition,
                                           int type,
                                           long inflatedSize)
                                    throws IOException
Event notifying the start of an object stored whole (not as a delta).

Parameters:
streamPosition - position of this object in the incoming stream.
type - type of the object; one of Constants.OBJ_COMMIT, Constants.OBJ_TREE, Constants.OBJ_BLOB, or Constants.OBJ_TAG.
inflatedSize - size of the object when fully inflated. The size stored within the pack may be larger or smaller, and is not yet known.
Throws:
IOException - the object cannot be recorded.

onEndWholeObject

protected abstract void onEndWholeObject(PackedObjectInfo info)
                                  throws IOException
Event notifying the end of the current object.

Parameters:
info - object information.
Throws:
IOException - the object cannot be recorded.

onBeginOfsDelta

protected abstract void onBeginOfsDelta(long deltaStreamPosition,
                                        long baseStreamPosition,
                                        long inflatedSize)
                                 throws IOException
Event notifying start of a delta referencing its base by offset.

Parameters:
deltaStreamPosition - position of this object in the incoming stream.
baseStreamPosition - position of the base object in the incoming stream. The base must be before the delta, therefore baseStreamPosition < deltaStreamPosition. This is not the position returned by a prior end object event.
inflatedSize - size of the delta when fully inflated. The size stored within the pack may be larger or smaller, and is not yet known.
Throws:
IOException - the object cannot be recorded.
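
In Git's pack format an offset delta stores the backward distance to its base as a variable-length integer, and the base position is the delta position minus that distance. The sketch below decodes the distance and applies the ordering constraint stated above. OfsDeltaSketch is a hypothetical helper, not JGit code; the parser presumably performs an equivalent computation before invoking this event.

```java
final class OfsDeltaSketch {
    /**
     * Decodes the distance starting at buf[pos]. Each byte contributes 7
     * bits, most significant group first. Bit 7 marks continuation, and 1
     * is added before each shift, as the pack format specifies.
     */
    static long decodeDistance(byte[] buf, int pos) {
        int c = buf[pos++] & 0xff;
        long ofs = c & 0x7f;
        while ((c & 0x80) != 0) {
            ofs += 1;
            c = buf[pos++] & 0xff;
            ofs <<= 7;
            ofs += c & 0x7f;
        }
        return ofs;
    }

    /** Resolves the base position and enforces base < delta. */
    static long basePosition(long deltaStreamPosition, long distance) {
        long base = deltaStreamPosition - distance;
        if (distance <= 0 || base < 0) {
            throw new IllegalArgumentException(
                    "base must lie before the delta within the stream");
        }
        return base;
    }
}
```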

onBeginRefDelta

protected abstract void onBeginRefDelta(long deltaStreamPosition,
                                        AnyObjectId baseId,
                                        long inflatedSize)
                                 throws IOException
Event notifying start of a delta referencing its base by ObjectId.

Parameters:
deltaStreamPosition - position of this object in the incoming stream.
baseId - name of the base object. This object may be later in the stream, or might not appear at all in the stream (in the case of a thin-pack).
inflatedSize - size of the delta when fully inflated. The size stored within the pack may be larger or smaller, and is not yet known.
Throws:
IOException - the object cannot be recorded.

onEndDelta

protected PackParser.UnresolvedDelta onEndDelta()
                                         throws IOException
Event notifying the end of the current delta.

Returns:
object information that must be populated with at least the offset.
Throws:
IOException - the object cannot be recorded.


Copyright © 2013. All Rights Reserved.