Wednesday, December 22, 2010
heLib release 0.3.1
Friday, September 10, 2010
Generic B+Trees (heLib release 0.3)
Sunday, September 5, 2010
heLib release 0.2
Wednesday, September 1, 2010
heLib release 0.1
- Generic vector containers with generic enumerators.
- Online documentation.
- Test suite.
- Permissive BSD-style license.
- Requires trunk FPC (2.5.1).
- Source tarball downloadable directly from the project home page.
- Subversion access
Sunday, August 22, 2010
Containers with multiple enumerators
It is impossible to choose among different possible enumerators. For example you can traverse a tree using different orders. The well known algorithms are: preorder, postorder, inorder and breadth‑first traversals. Therefore it would be useful to have an ability to choose an enumerator. For example using the following syntax:It is true that any type can have only one GetEnumerator method attached to it. But that's IMO not the same as "It is impossible to choose among different possible enumerators."
type TTreeEnumeratorType = (tePreOrder, tePostOrder, teInOrder, teBreadthFirst) procedure TraverseTree(Tree: TTree); var Node: TNode; begin // Variant1. For the class instances we can call the method Tree.GetEnumerator(teInOrder). // For the classes we can call a class method for Node in Tree using GetEnumerator(teInOrder) do Dosomething(Node); // Variant2. Or we can call the global function for Node in Tree using GetEnumerator(Tree, teInOrder) do Dosomething(Node); // Variant3. In the previous variant 'in Tree' is useless so the next code is a simplified form: for Node using GetEnumerator(Tree, teInOrder) do Dosomething(Node); // Variant4. We can try to avoid new context key-word 'using' by calling method: for Node in Tree.GetSomeEnumerator(teInOrder) do Dosomething(Node); // but this brings ambiguity to the compiler since Tree.GetSomeEnumerator(teInOrder) can be translated into // Tree.GetSomeEnumerator(teInOrder).GetEnumerator // This ambiguity might be resolvable by checking whether the class implements IEnumerator interface end;
I guess, i.e. this was not tested and not even tried - it is already possible to have different enumerators on a type. The desired behaviour could be achieved presumably in two ways.
Using IEnumerator?
Iff the current FPC implementation checks for existence of a IEnumerator derived implemented interface of a type (i.e. as opposed to checking for existence of exactly only IEnumerator interface), then it should be possible to have multiple distinct IEnumerator derived interfaces implemented by a type (each one returning a different enumerator object) and then "select" the enumerator like this:
// maybe parenthesis would be required bellow, i.e. (Tree as ITreeInOrder) for Node in Tree as ITreeInOrder do Dosomething(Node); for Node in Tree as ITreeBreadthFirst do Dosomething(Node);
Using the enumerator operator?
Another possibility is to declare multiple enumerator operators with differently typed operands and then cast the real type to select the specific enumerator:
type TInOrderTree = class(TTree); TBreadthFirstTree = class(TTree); operator Enumerator(Op: TInOrderTree): TTreeEnumerator; begin // implementation end; operator Enumerator(Op: TBreadthFirstTree): TTreeEnumerator; begin // different implementation end; //... for Node in TInOrderTree(Tree) do Dosomething(Node); for Node in TBreadthFirstTree(Tree) do Dosomething(Node);
The choice
The interface based variant seems a bit heavy to me. The casting is maybe somewhat lighter (per opinion), but at the same time a bit dangerous as type casts in FPC are - and have to be for a reason - usually a way how to tell the compiler not to take care of any type checking no more (an oversimplification, I admit, but ~applies to the current task). Both variants do also require declaration of new types, but that's a minor issue.
Let's get back to Paul's original proposal and specifically the Variant4. The observation "but this brings ambiguity to the compiler since Tree.GetSomeEnumerator(teInOrder) can be translated into Tree.GetSomeEnumerator(teInOrder).GetEnumerator" is correct. IMO the compiler must interpret the code exactly in this and only this way, but that's another story. Variant4 is also interesting here as it was chosen to play with in an attempt to get a usable case/implementation/usage pattern.
It's a matter of only personal preferences, but I like distinctly naming the specific "GetEnumerators" more than parameterizing a single one, so the container's client code will be something like:
for Node in Tree.InOrder do Dosomething(Node); for Node in Tree.BreadthFirst do Dosomething(Node);It's a few characters shorter and IMO pretty readable also. Conceptually, one should choose which enumerating behaviour should be the default one, supposedly based on the estimated typical usage pattern, and implement that enumerator as the (single) .GetEnumerator method of such type. All the other enumerators, if properly named, would make the intention clear. For example a TSomeMap could have a "default" enumerator returning the key-value pairs and the the specific/special .Key and .Value enumerators:
type TKey = String; TValue = Integer; TPair = record Key: TKey; Value: TValue; end; TMyMap = class(specialize TSomeMap<TKey, TValue>) // ... end; // ... var M: TMyMap; P: TPair; K: TKey; V: TValue; // ... for P in M do ProcessPair(P); for K in M.Key do ProcessKey(K); for V in M.Value do ProcessValue(V);
The implementation
Bellow are excerpts from heContnrs.pas implementing both the default (0 .. Count - 1) and the .Reversed (Count - 1 .. 0) enumerators of a vector.
//... interface type { TheEnumerator } generic TheEnumerator<TIterator, TValue> = object public type TGetCurrent = function(const Iterator: TIterator): TValue of object; TMoveNext = function(var Iterator: TIterator): Boolean of object; private FGetCurrent: TGetCurrent; FIterator: TIterator; FMoveNext: TMoveNext; function GetCurrent: TValue; public procedure Init(const InitialIterator: TIterator; const Mover: TMoveNext; const Getter: TGetCurrent); function MoveNext: Boolean; property Current: TValue read GetCurrent; end; { TheEnumeratorProvider } generic TheEnumeratorProvider<TProvidedEnumerator> = object public FEnumerator: TProvidedEnumerator; function GetEnumerator: TProvidedEnumerator; end; { TheVector } generic TheVector<TItem> = class public type //... TEnumerator = specialize TheEnumerator<Integer, TItem>; TReverseEnumeratorProvider = specialize TheEnumeratorProvider<TEnumerator>; private type PItem = ^TItem; private //... function GetItems(const Index: Integer): TItem; function MoveNext(var Index: Integer): Boolean; function MovePrev(var Index: Integer): Boolean; public //... function GetEnumerator: TEnumerator; function Reversed: TReverseEnumeratorProvider; end; implementation uses SysUtils; { TheEnumerator } function TheEnumerator.GetCurrent: TValue; begin Result := FGetCurrent(FIterator); end; function TheEnumerator.MoveNext: Boolean; begin Result := FMoveNext(FIterator); end; procedure TheEnumerator.Init(const InitialIterator: TIterator; const Mover: TMoveNext; const Getter: TGetCurrent); begin Assert(Assigned(Mover)); Assert(Assigned(Getter)); FIterator := InitialIterator; FMoveNext := Mover; FGetCurrent := Getter; end; { TheEnumeratorProvider } function TheEnumeratorProvider.GetEnumerator: TProvidedEnumerator; begin Result := FEnumerator; end; { TheVector } //... function TheVector.GetItems(const Index: Integer): TItem; begin Assert((Index >= 0) and (Index < Count)); Result := FData[Index]; end; function TheVector.MoveNext(var Index: Integer): Boolean; begin Inc(Index); Result := Index < Count; end; function TheVector.MovePrev(var Index: Integer): Boolean; begin Dec(Index); Result := Index >= 0; end; function TheVector.GetEnumerator: TEnumerator; begin Result.Init(-1, @MoveNext, @GetItems); end; function TheVector.Reversed: TReverseEnumeratorProvider; begin Result.FEnumerator.Init(Count, @MovePrev, @GetItems); end; end.
The example
This code ...
program project1; {$mode objfpc}{$H+} uses heContnrs; type TStrVector = specialize TheVector<String>; var V: TStrVector; S: String; begin {$if declared(HaltOnNotReleased)} HaltOnNotReleased := True; {$endif} V := TStrVector.Create; try V.Add('foo'); V.Add('bar'); V.Add('baz'); Writeln('for S in V ...'); for S in V do Writeln('S = ''', S, ''''); Writeln('for S in V.Reversed ...'); for S in V.Reversed do Writeln('S = ''', S, ''''); finally V.Free; end; end.... compiled and executed, produces this output:
$ ./project1 for S in V ... S = 'foo' S = 'bar' S = 'baz' for S in V.Reversed ... S = 'baz' S = 'bar' S = 'foo' Heap dump by heaptrc unit 9 memory blocks allocated : 892/912 9 memory blocks freed : 892/912 0 unfreed memory blocks : 0 True heap size : 360448 True free heap : 360448 $
The credits
The authorship of the original ideas on multiple enumerators, leading to this post and the above implementation, should be fully credited to and only to Paul Ishenin. My best thanks to him for sharing his work.Tuesday, August 17, 2010
Introducing heLib
ATM one can find there a work in progress state of a generic [object] vector with generic enumerators, roughly/conceptually corresponding to this older post.
Thursday, May 20, 2010
Sampler: GDBM OO wrapper
unit Unit1; { (c) 2010 bflm, contact: befelemepeseveze at Google's free mail Free sample code - copy, use, modify, distribute, sell, release under other license, ... } {$mode objfpc}{$h+} interface const DEFAULT_FMODE = 432; // =0660(8) type { TGDBM } TGDBM = class private FCacheSize: Integer; FCentFree: Longbool; FCoalesceBlks: Longbool; FDBF: Pointer; FSyncMode: Longbool; function GetFDesc: Integer; procedure Open(Name: String; OpenRW: Integer; const Sync, NoLock: Boolean; BlockSize: Integer; Mode: Integer); procedure SetCacheSize(const AValue: Integer); procedure SetCentFree(const AValue: Longbool); procedure SetCoalesceBlks(const AValue: Longbool); procedure SetSyncMode(const AValue: Longbool); public constructor CreateNewDB(const Name: String; const Sync: Boolean = False; const NoLock: Boolean = False; const BlockSize: Integer = 0; const FMode: Integer = DEFAULT_FMODE); constructor CreateOrOpen(const Name: String; const Sync: Boolean = False; const NoLock: Boolean = False; const BlockSize: Integer = 0; const FMode: Integer = DEFAULT_FMODE); constructor CreateReader(const Name: String); constructor CreateWriter(const Name: String; const Sync: Boolean = False; const NoLock: Boolean = False); destructor Destroy; override; { True if Key deleted } function Delete(const Key: String): Boolean; { True if Key deleted } function Delete(const Key; const KeyLen: Integer): Boolean; function Exists(const Key: String): Boolean; function Exists(const Key; const KeyLen: Integer): Boolean; { True if Key found => Value set } function Fetch(const Key: String; out Value: String): Boolean; { True if Key found => Value set. Value MUST be disposed using C free. } function Fetch(const Key; const KeyLen: Integer; out Value: Pointer; out Len: Integer): Boolean; { True if FirstKey returned } function FirstKey(out Key: String): Boolean; { True if FirstKey returned } function FirstKey(out Key: Pointer; out Len: Integer): Boolean; { True if Key already existed => new Value NOT stored } function Insert(const Key, Value: String): Boolean; { True if Key already existed => new Value NOT stored } function Insert(const Key; const KeyLen: Integer; const Value; const ValueLen: Integer): Boolean; { True if Next returned } function NextKey(const Key: String; var Next: String): Boolean; { True if Next returned. Key MUST be disposed using C free. } function NextKey(const Key; const KeyLen: Integer; var Next: Pointer; var Len: Integer): Boolean; procedure Reorganize; procedure Replace(const Key, Value: String); procedure Replace(const Key; const KeyLen: Integer; const Value; const ValueLen: Integer); procedure Sync; property CacheSize: Integer read FCacheSize Write SetCacheSize; property CentFree: Longbool read FCentFree Write SetCentFree; property CoalesceBlks: Longbool read FCoalesceBlks Write SetCoalesceBlks; property FDesc: Integer read GetFDesc; property SyncMode: Longbool read FSyncMode Write SetSyncMode; end; implementation uses SysUtils, gdbm; procedure CFree(P: pointer); cdecl; external 'c' Name 'free'; function Datum(const Buf: Pointer; const Len: Integer): TDatum; inline; begin Result.dptr := Buf; Result.dsize := Len; end; procedure Fail; begin raise Exception.Create(gdbm_strerror(gdbm_errno)); end; function CheckError(const Ret: Integer): Integer; begin if Ret <> 0 then Fail; Result := Ret; end; operator := (const Datum: TDatum): String; begin if Datum.dptr <> nil then begin SetLength(Result, Datum.dsize); Move(Datum.dptr^, Result[1], Datum.dsize); CFree(Datum.dptr); end else Result := ''; end; operator := (const S: String): TDatum; begin Assert(S <> ''); Result.dptr := PChar(S); Result.dsize := Length(S); end; { TGDBM } constructor TGDBM.CreateNewDB(const Name: String; const Sync, NoLock: Boolean; const BlockSize: Integer = 0; const FMode: Integer = DEFAULT_FMODE); begin inherited Create; Open(Name, GDBM_NEWDB, Sync, NoLock, BlockSize, FMode); end; constructor TGDBM.CreateOrOpen(const Name: String; const Sync, NoLock: Boolean; const BlockSize: Integer; const FMode: Integer); begin inherited Create; Open(Name, GDBM_WRCREAT, Sync, NoLock, BlockSize, FMode); end; constructor TGDBM.CreateReader(const Name: String); begin inherited Create; Open(Name, GDBM_READER, False, False, 0, 0); end; constructor TGDBM.CreateWriter(const Name: String; const Sync, NoLock: Boolean); begin inherited Create; Open(Name, GDBM_WRITER, Sync, NoLock, 0, 0); end; destructor TGDBM.Destroy; begin if FDBF <> nil then gdbm_close(FDBF); inherited; end; function TGDBM.Delete(const Key: String): Boolean; var K: TDatum; begin K := Key; Result := gdbm_delete(FDBF, K) = 0; end; function TGDBM.Delete(const Key; const KeyLen: Integer): Boolean; begin Result := gdbm_delete(FDBF, Datum(@Key, KeyLen)) = 0; end; function TGDBM.Exists(const Key: String): Boolean; var K: TDatum; begin K := Key; Result := gdbm_exists(FDBF, K) <> 0; end; function TGDBM.Exists(const Key; const KeyLen: Integer): Boolean; begin Result := gdbm_exists(FDBF, Datum(@Key, KeyLen)) <> 0; end; function TGDBM.Fetch(const Key: String; out Value: String): Boolean; var K, V: TDatum; begin K := Key; V := gdbm_fetch(FDBF, K); if V.dptr <> nil then begin Value := V; Exit(True); end; Result := False; end; function TGDBM.Fetch(const Key; const KeyLen: Integer; out Value: Pointer; out Len: Integer): Boolean; var V: TDatum; begin V := gdbm_fetch(FDBF, Datum(@Key, KeyLen)); if V.dptr = nil then Exit(False); Value := V.dptr; Len := V.dsize; Result := True; end; function TGDBM.FirstKey(out Key: String): Boolean; var K: TDatum; begin K := gdbm_firstkey(FDBF); if K.dptr <> nil then begin Key := K; Exit(True); end; Result := False; end; function TGDBM.FirstKey(out Key: Pointer; out Len: Integer): Boolean; var K: TDatum; begin K := gdbm_firstkey(FDBF); if K.dptr <> nil then begin Key := K.dptr; Len := K.dsize; Exit(True); end; Result := False; end; procedure TGDBM.Open(Name: String; OpenRW: Integer; const Sync, NoLock: Boolean; BlockSize: Integer; Mode: Integer); begin Assert(Name <> ''); FSyncMode := Sync; if SyncMode then OpenRW += GDBM_DOSYNC; if NoLock then; OpenRW += GDBM_NOLOCK; FDBF := gdbm_open(PChar(Name), BlockSize, OpenRW, Mode, nil); if FDBF = nil then Fail; end; function TGDBM.GetFDesc: Integer; begin Result := gdbm_fdesc(FDBF); end; procedure TGDBM.SetCacheSize(const AValue: Integer); var Ret: Integer; begin if FCacheSize = AValue then exit; FCacheSize := AValue; Ret := gdbm_setopt(FDBF, GDBM_CACHESIZE, @FCacheSize, SizeOf(FCacheSize)); if Ret <> 0 then Fail; end; procedure TGDBM.SetCentFree(const AValue: Longbool); begin if FCentFree = AValue then exit; FCentFree := AValue; CheckError(gdbm_setopt(FDBF, GDBM_CENTFREE, @FCentFree, SizeOf(FCentFree))); end; procedure TGDBM.SetCoalesceBlks(const AValue: Longbool); begin if FCoalesceBlks = AValue then exit; FCoalesceBlks := AValue; CheckError(gdbm_setopt(FDBF, GDBM_COALESCEBLKS, @FCoalesceBlks, SizeOf(FCoalesceBlks))); end; procedure TGDBM.SetSyncMode(const AValue: Longbool); begin if FSyncMode = AValue then exit; FSyncMode := AValue; CheckError(gdbm_setopt(FDBF, GDBM_SYNCMODE, @FSyncMode, SizeOf(FSyncMode))); end; function TGDBM.Insert(const Key, Value: String): Boolean; var K, V: TDatum; Ret: Integer; begin K := Key; V := Value; Ret := gdbm_store(FDBF, K, V, GDBM_INSERT); ; if Ret < 0 then // reader attempts insert Fail; Result := Ret <> 0; end; function TGDBM.Insert(const Key; const KeyLen: Integer; const Value; const ValueLen: Integer): Boolean; var Ret: Integer; begin Ret := gdbm_store(FDBF, Datum(@Key, KeyLen), Datum(@Value, ValueLen), GDBM_INSERT); if Ret < 0 then // reader attempts insert Fail; Result := Ret <> 0; end; function TGDBM.NextKey(const Key: String; var Next: String): Boolean; var K, N: TDatum; begin K := Key; N := gdbm_nextkey(FDBF, K); if N.dptr <> nil then begin Next := N; Exit(True); end; Result := False; end; function TGDBM.NextKey(const Key; const KeyLen: Integer; var Next: Pointer; var Len: Integer): Boolean; var N: TDatum; begin N := gdbm_nextkey(FDBF, Datum(@Key, KeyLen)); if N.dptr <> nil then begin Next := N.dptr; Len := N.dsize; Exit(True); end; Result := False; end; procedure TGDBM.Reorganize; var Ret: Integer; begin Ret := gdbm_reorganize(FDBF); if Ret <> 0 then Fail; end; procedure TGDBM.Replace(const Key, Value: String); var K, V: TDatum; begin K := Key; V := Value; CheckError(gdbm_store(FDBF, K, V, GDBM_REPLACE)); end; procedure TGDBM.Replace(const Key; const KeyLen: Integer; const Value; const ValueLen: Integer); begin CheckError(gdbm_store(FDBF, Datum(@Key, KeyLen), Datum(@Value, ValueLen), GDBM_REPLACE)); end; procedure TGDBM.Sync; begin gdbm_sync(FDBF); end; end.
Monday, May 10, 2010
Sampler: Generic enumerators for generic containers
unit Unit1; { (c) 2010 bflm, contact: befelemepeseveze at Google's free mail Free sample code - copy, use, modify, distribute, sell, release under other license, ... } {$mode objfpc}{$H+} interface type { TEnumerator } generic TEnumerator<TValue, TIterator> = object public type TGetCurrent = function(const Iterator: TIterator): TValue of object; TMoveNext = function(var Iterator: TIterator): Boolean of object; private var FGetCurrent: TGetCurrent; FMoveNext: TMoveNext; FIterator: TIterator; function GetCurrent: TValue; public procedure Init(const InitialIterator: TIterator; const Mover: TMoveNext; const Getter: TGetCurrent); function MoveNext: Boolean; property Current: TValue read GetCurrent; end; { TVector } generic TVector<T> = object public type TData = array of T; TCompare = function(const A, B: T): Integer; TVectorEnumerator = specialize TEnumerator<T, Integer>; private var FData: TData; FLen: Integer; private function GetCap: Integer; function GetData: TData; function GetItems(const Index: Integer): T; inline; function GetLen: Integer; inline; function MoveNext(var Index: Integer): Boolean; procedure Sort(Left, Right: Integer; const Compare: TCompare); procedure SetCap(const AValue: Integer); procedure SetData(const AValue: TData); procedure SetItems(const Index: Integer; const AValue: T); inline; procedure SetLen(const AValue: Integer); public function GetEnumerator: TVectorEnumerator; function Add(const Value: T): Integer; procedure Clear; procedure Pack; procedure Sort(const Compare: TCompare); property Cap: Integer read GetCap write SetCap; property Data: TData read GetData write SetData; property Items[const Index: Integer]: T read GetItems write SetItems; default; property Len: Integer read GetLen write SetLen; end; implementation { TEnumerator } function TEnumerator.GetCurrent: TValue; begin Result := FGetCurrent(FIterator); end; function TEnumerator.MoveNext: Boolean; begin Result := FMoveNext(FIterator); end; procedure TEnumerator.Init(const InitialIterator: TIterator; const Mover: TMoveNext; const Getter: TGetCurrent); begin Assert(Assigned(Mover)); Assert(Assigned(Getter)); FIterator := InitialIterator; FMoveNext := Mover; FGetCurrent := Getter; end; { TVector } function TVector.Add(const Value: T): Integer; var NeedCap: Integer; begin NeedCap := Len + 1; if NeedCap > Cap then Cap := 2 * NeedCap; Result := FLen; Inc(FLen); Items[Result] := Value; end; procedure TVector.Clear; begin Cap := 0; end; function TVector.GetItems(const Index: Integer): T; begin Result := FData[Index]; end; function TVector.GetCap: Integer; begin Result := Length(FData); end; function TVector.GetData: TData; begin Result := Copy(FData, 0, Len); end; function TVector.GetEnumerator: TVectorEnumerator; begin Result.Init(-1, @MoveNext, @GetItems); end; function TVector.GetLen: Integer; begin if FData = nil then FLen := 0; // boot Result := FLen; end; function TVector.MoveNext(var Index: Integer): Boolean; begin Result := Index + 1 < Len; if Result then Inc(Index); end; procedure TVector.SetCap(const AValue: Integer); begin if AValue = Cap then Exit; if Len > AValue then FLen := AValue; SetLength(FData, AValue); end; procedure TVector.SetData(const AValue: TData); begin FData := AValue; FLen := Cap; end; procedure TVector.SetItems(const Index: Integer; const AValue: T); begin FData[Index] := AValue; end; procedure TVector.SetLen(const AValue: Integer); begin if AValue = Len then Exit; if AValue > Cap then Cap := AValue else if AValue <= Cap div 2 then Cap := Cap div 2; FLen := AValue; end; procedure TVector.Sort(Left, Right: Integer; const Compare: TCompare); var L, R: Integer; Pivot, Swap: T; begin repeat L := Left; R := Right; Pivot := Items[(Left + Right) div 2]; repeat while Compare(Pivot, Items[L]) > 0 do L += 1; while Compare(Pivot, Items[R]) < 0 do R -= 1; if L <= R then begin Swap := Items[L]; Items[L] := Items[R]; Items[R] := Swap; L += 1; R -= 1; end; until L > R; if Left < R then Sort(Left, R, Compare); Left := L; until L >= Right; end; procedure TVector.Pack; begin Cap := Len; end; procedure TVector.Sort(const Compare: TCompare); begin if Len > 1 then Sort(0, FLen - 1, Compare); end; end.
program project1; {$mode objfpc}{$H+} uses Unit1; type TIntVector = specialize TVector<Integer>; function IntCmp(const A, B: Integer): Integer; begin Result := A - B; end; procedure Dump(const Vector: TIntVector); var V: Integer; begin Write(Vector.Len, ' item(s): '); for V in Vector do Write(V, ' '); Writeln; end; procedure Test(const N, X: Integer); var Vector: TIntVector; I: Integer; begin for I := 1 to N do Vector.Add(I xor X); Dump(Vector); Vector.Sort(@IntCmp); Dump(Vector); WriteLn; end; begin // main {$if declared(HaltOnNotReleased)} HaltOnNotReleased := True; {$endif} Test(0, 0); Test(1, 0); Test(11, 0); Test(13, $55); Test(17, $AA); Test(19, $FF); end.
$ fpc -B -gh project1 && ./project1 Free Pascal Compiler version 2.5.1 [2010/05/05] for x86_64 Copyright (c) 1993-2009 by Florian Klaempfl Target OS: Linux for x86-64 Compiling project1.pas Compiling unit1.pas Linking project1 /usr/bin/ld: warning: link.res contains output sections; did you forget -T? 259 lines compiled, 0.1 sec 0 item(s): 0 item(s): 1 item(s): 1 1 item(s): 1 11 item(s): 1 2 3 4 5 6 7 8 9 10 11 11 item(s): 1 2 3 4 5 6 7 8 9 10 11 13 item(s): 84 87 86 81 80 83 82 93 92 95 94 89 88 13 item(s): 80 81 82 83 84 86 87 88 89 92 93 94 95 17 item(s): 171 168 169 174 175 172 173 162 163 160 161 166 167 164 165 186 187 17 item(s): 160 161 162 163 164 165 166 167 168 169 171 172 173 174 175 186 187 19 item(s): 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 19 item(s): 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 Heap dump by heaptrc unit 11 memory blocks allocated : 840/840 11 memory blocks freed : 840/840 0 unfreed memory blocks : 0 True heap size : 98304 True free heap : 98304 $
Wednesday, May 5, 2010
Unicode Category Membership Checking
Wednesday, February 10, 2010
String Buffer Performance
program project1; {$mode objfpc}{$H+} uses Classes, DateUtils, SysUtils, StrBuf; type TBench = procedure(const PartSize, MaxSize: Integer); procedure BuildString(const PartSize, MaxSize: Integer); var Buf: String; BufSize: Integer; begin Buf := ''; BufSize := 0; while BufSize < MaxSize do begin Buf += StringOfChar(' ', PartSize); BufSize += PartSize; end; end; procedure BuildStringList(const PartSize, MaxSize: Integer); var Buf: TStringList; BufSize: Integer; begin Buf := TStringList.Create; BufSize := 0; while BufSize < MaxSize do begin Buf.Add(StringOfChar(' ', PartSize)); BufSize += PartSize; end; Buf.Free; end; procedure BuildStrBuf(const PartSize, MaxSize: Integer); var Buf: TStrBuf; BufSize: Integer; begin BufSize := 0; while BufSize < MaxSize do begin Buf.W(StringOfChar(' ', PartSize)); BufSize += PartSize; end; end; procedure Build(const PartSize, MaxSize: Integer; Bench: TBench; const Name: String); const RUN = 5; var T0: TDateTime; Msec: Int64; I: Integer; begin T0 := Now; for I := 1 to RUN do Bench(PartSize, MaxSize); MSec := MilliSecondsBetween(Now, T0); Writeln(Format('partsize: %.5d T: %.7dms %s ', [PartSize, MSec, Name])); end; const UBOUND_PART = 100; UBOUND_BUF = 1000000; var PartSize: Integer; begin PartSize := 1; repeat Build(PartSize, UBOUND_BUF, @BuildString, 'String'); Build(PartSize, UBOUND_BUF, @BuildStringList, 'StringList'); Build(PartSize, UBOUND_BUF, @BuildStrBuf, 'StrBuf'); PartSize := ((3 * PartSize) + 1) div 2; until PartSize > UBOUND_PART; end.
$ fpc -B ./project1 && ./project1 | sort Free Pascal Compiler version 2.5.1 [2010/02/07] for x86_64 Copyright (c) 1993-2009 by Florian Klaempfl Target OS: Linux for x86-64 Compiling ./project1.pas Compiling strbuf.pas Linking project1 /usr/bin/ld: warning: link.res contains output sections; did you forget -T? 189 lines compiled, 0.2 sec partsize: 00001 T: 0000446ms StrBuf partsize: 00001 T: 0000602ms String partsize: 00001 T: 0001150ms StringList partsize: 00002 T: 0000233ms StrBuf partsize: 00002 T: 0000261ms String partsize: 00002 T: 0000534ms StringList partsize: 00003 T: 0000159ms StrBuf partsize: 00003 T: 0000190ms String partsize: 00003 T: 0000379ms StringList partsize: 00005 T: 0000101ms StrBuf partsize: 00005 T: 0000127ms String partsize: 00005 T: 0000211ms StringList partsize: 00008 T: 0000076ms StrBuf partsize: 00008 T: 0000086ms String partsize: 00008 T: 0000167ms StringList partsize: 00012 T: 0000050ms StrBuf partsize: 00012 T: 0000060ms String partsize: 00012 T: 0000100ms StringList partsize: 00018 T: 0000038ms StrBuf partsize: 00018 T: 0000043ms String partsize: 00018 T: 0000069ms StringList partsize: 00027 T: 0000030ms StrBuf partsize: 00027 T: 0000033ms String partsize: 00027 T: 0000045ms StringList partsize: 00041 T: 0000023ms String partsize: 00041 T: 0000024ms StrBuf partsize: 00041 T: 0000034ms StringList partsize: 00062 T: 0000015ms StrBuf partsize: 00062 T: 0000020ms String partsize: 00062 T: 0000020ms StringList partsize: 00093 T: 0000014ms String partsize: 00093 T: 0000015ms StringList partsize: 00093 T: 0000017ms StrBuf $
StrBuf's performance looks good or better compared to TStringList and/or String concatenating while you're appending short pieces of text, say up to few dozens of characters and that's what it was intended and written for.
Tuesday, February 9, 2010
Simple String Buffer
unit StrBuf; {$mode objfpc}{$H+} { Simple string buffer No setup/teardown required, memory mngmt, initialization, and finalization is done by FPC itself. Copyright (C) 2010 bflm, contact: befelemepeseveze at Google's free mail This library is free software; you can redistribute it and/or modify it under the terms of the GNU Library General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version with the following modification: As a special exception, the copyright holders of this library give you permission to link this library with independent modules to produce an executable, regardless of the license terms of these independent modules,and to copy and distribute the resulting executable under terms of your choice, provided that you also meet, for each linked independent module, the terms and conditions of the license of that module. An independent module is a module which is not derived from or based on this library. If you modify this library, you may extend this exception to your version of the library, but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details. You should have received a copy of the GNU Library General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. } interface uses SysUtils; type { TStrBuf } TStrBuf = object private FLen: Integer; FS: String; function GetS: String; public procedure W(const S: String); procedure W(const Fmt: String; Args: array of const); property S: String read GetS; end; operator +(var Buf: TStrBuf; const S: String): TStrBuf; operator :=(const S: String): TStrBuf; operator :=(var Buf: TStrBuf): String; implementation { TStrBuf } function TStrBuf.GetS: String; begin if FS = '' then FLen := 0; Result := LeftStr(FS, FLen); end; procedure TStrBuf.W(const S: String); var Len, Len1: Integer; begin Len := Length(S); if Len = 0 then Exit; if FS = '' then FLen := 0; Len1 := FLen + Len; if Len1 > Length(FS) then SetLength(FS, 2 * Len1); Move(S[1], FS[FLen + 1], Len); FLen += Len; end; procedure TStrBuf.W(const Fmt: String; Args: array of const); begin W(Format(Fmt, Args)); end; operator +(var Buf: TStrBuf; const S: String): TStrBuf; begin Result.FS := Buf.FS; Result.FLen := Buf.FLen; Result.W(S); end; operator :=(const S: String): TStrBuf; begin Result.FLen := Length(S); Result.FS := S; end; operator :=(var Buf: TStrBuf): String; begin Result := Buf.S; end; end.
program project1; {$mode objfpc}{$H+} // Simple string buffer example uses heaptrc, StrBuf; function F(N: Integer): String; var Buf: TStrBuf; I: Integer; begin for I := 1 to N do Buf.W('line %d%s', [I, sLineBreak]); Result := Buf.S; end; var Buf, Buf2: TStrBuf; begin Buf.W('abc'); Buf.W('%d', [123]); Buf += 'xyz'; Writeln(Buf.S); Writeln(String(Buf)); Buf := '456'; Writeln(Buf.S); Buf2 := Buf + ' bar'; Buf += ' foo'; Writeln(Buf.S); Writeln(Buf2.S); Write(F(10)); end.
$ fpc -B project1.pas && ./project1 Free Pascal Compiler version 2.5.1 [2010/02/07] for x86_64 Copyright (c) 1993-2009 by Florian Klaempfl Target OS: Linux for x86-64 Compiling project1.pas Compiling strbuf.pas Linking project1 /usr/bin/ld: warning: link.res contains output sections; did you forget -T? 148 lines compiled, 0.2 sec abc123xyz abc123xyz 456 456 foo 456 bar line 1 line 2 line 3 line 4 line 5 line 6 line 7 line 8 line 9 line 10 Heap dump by heaptrc unit 71 memory blocks allocated : 2926/3200 71 memory blocks freed : 2926/3200 0 unfreed memory blocks : 0 True heap size : 360448 True free heap : 360448 $