SAS_PACKAGES/packages/baseplus.md
yabwon f7485ce6c4 **SAS Packages Framework**, version 20210528
**SAS Packages Framework**, version 20210528:

Help tags selection modified in the `%generatePackage()` macro.
The new solution allows help tags to be written around comments in two ways.
The first (old) is:
```
/*** HELP START ***/
/*
comment
*/
/*** HELP END ***/
```

and the second (new):
```
/*** HELP START ***//*
comment
*//*** HELP END ***/
```
The second form allows help info to be printed in the log without the surrounding `/*` and `*/`. It looks better and makes it easier to build `.md` files or other help documents (you do not have to remove `/*` by hand).

Documentation updated.

The following packages were regenerated with new version of the SPF:
- BasePlus
- DFA
- dynMacroArray
- macroArray
- SQLinDS
2021-05-28 11:47:28 +02:00


The BasePlus package [ver. 0.991]

The BasePlus package implements useful functions and functionalities I miss in BASE SAS.

It is inspired by various people, e.g.

  • at the SAS-L discussion list
  • at the communities.sas.com (SASware Ballot Ideas)
  • at the Office...
  • etc.

Kudos to all who inspired me to generate this package: Mark Keintz, Paul Dorfman, Richard DeVenezia, Christian Graffeuille.


BASIC EXAMPLES AND USECASES:

Example 1: One-dimensional array functions. Array parameters to subroutine calls must be 1-based.

  data _null_;
   array X[4] _temporary_ (. 1 . 2);

   call arrMissToRight(X);
    do i = 1 to 4;
     put X[i]= @;
    end;
    put;

   call arrMissFill(17, X);
    do i = 1 to 4;
     put X[i]= @;
    end;
    put;

   call arrFill(42, X);
    do i = 1 to 4;
     put X[i]= @;
    end;
  run;

Example 2: Delete dataset by name.

  data toDrop;
    x = 17;
  run;
  data _null_;
    p = delDataset("toDrop");
    put p=;
  run;

Example 3: String concatenation with a format.

  data test;
    x =  1 ; y =  . ; z =  3 ;
    t = "t"; u = " "; v = "v";

    array a[*] x y z;
    array b[*] t u v;

    length s1 s2 s3 s4 $ 17;
    s1 = catXFn("z5.", "#", A);
    s2 = catXFi("z5.", "#", A);
    s3 = catXFc("upcase.", "*", B);
    s4 = catXFj("upcase.", "*", B);

    put (_all_) (=);
  run;

Example 4: Useful formats.

  data _null_;
    input x @@;
    put @1 x= @11 x= bool. @21 x= int. @31 x= ceil. @41 x= floor.;
  cards;
  . ._ .A -10 -3.14 0 3.14 10
  ;
  run;

Example 5: Getting variable names from datasets.

  %put *%getVars(sashelp.class
                ,pattern   = ght$
                ,sep      = +
                ,varRange = _numeric_)*;

Example 6: Quick sort as an alternative to call sortn()

  data _null_;
    array test[25000000] _temporary_ ;

    t = time();
      call streaminit(123);
      do _N_ = 25000000 to 1 by -1;
        test[_N_] = rand("uniform");
      end;
    t = time() - t;
    put "Array population time: "  t;

    t = time();
      call quickSortLight (test);
    t = time()-t;
    put "Sorting time: " / t=;
  run;

Example 7: Deduplicate values from a space separated list.

  %let list = 4 5 6 1 2 3 1 2 3 4 5 6;
  %put *%dedupListS(&list.)*;

Example 8: Zip elements of two space-separated lists.

  %let x = %zipEvalf(1 2 3 4 5 6, 2018 2019 2020, argMd=5, function=MDY, format=date11.);
  %put &=x;

Package contains:

  1. macro deduplistc
  2. macro deduplistp
  3. macro deduplists
  4. macro deduplistx
  5. macro getvars
  6. macro qdeduplistx
  7. macro qgetvars
  8. macro qzipevalf
  9. macro symdelglobal
  10. macro zipevalf
  11. format bool
  12. format boolz
  13. format ceil
  14. format floor
  15. format int
  16. functions arrfill
  17. functions arrfillc
  18. functions arrmissfill
  19. functions arrmissfillc
  20. functions arrmisstoleft
  21. functions arrmisstoleftc
  22. functions arrmisstoright
  23. functions arrmisstorightc
  24. functions bracketsc
  25. functions bracketsn
  26. functions catxfc
  27. functions catxfi
  28. functions catxfj
  29. functions catxfn
  30. functions deldataset
  31. functions semicolonc
  32. functions semicolonn
  33. format brackets
  34. format semicolon
  35. proto qsortincbyprocproto
  36. functions frommissingtonumberbs
  37. functions fromnumbertomissing
  38. functions quicksort4notmiss
  39. functions quicksorthash
  40. functions quicksorthashsddv
  41. functions quicksortlight

SAS package generated by generatePackage, version 20210109

The SHA256 hash digest for package BasePlus: A321A4BC54D444B82575EC5D443553A096557AD69DC171D578A330277E67637A


Content description

>>> %getVars() macro: <<<

The getVars() and QgetVars() macro functions extract variable names from a dataset into a list, according to a given pattern.

The getVars() returns an unquoted value [by %unquote()]. The QgetVars() returns a quoted value [by %superq()].

See examples below for the details.

The %getVars() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%getVars(
   ds               
 <,sep=>
 <,pattern=>
 <,varRange=>
 <,quote=>
 <,mcArray=> 
)

Arguments description:

  1. ds - Required, the name of the dataset from which variables are to be taken.
  • sep = %str( ) - Optional, default value %str( ), a variables separator on the created list.

  • pattern = .* - Optional, default value .* (i.e. any text), a variable name regexp pattern, case INSENSITIVE!

  • varRange = _all_ - Optional, default value _all_, a named range list of variables.

  • quote = - Optional, default value is blank, a quotation symbol to be used around values.

  • mcArray= - Optional, default value is blank. 1) When null - the macro behaves like a macro function and returns a text string with the variables list. 2) When not null - the behaviour of the macro is altered. In that case a macro array of the selected variables, named with the mcArray value as a prefix, is created. Furthermore, a macro named with the mcArray value is generated (see the macroArray package for the details). When the mcArray= parameter is active the getVars macro cannot be called within the %put statement. Execution like: %put %getVars(..., mcArray=XXX); will result in an Explicit & Radical Refuse Of Run (aka ERROR).

EXAMPLES AND USECASES:

EXAMPLE 1. A list of all variables from the sashelp.class dataset:

  %put *%getVars(sashelp.class)*;

EXAMPLE 2. A list of all variables from the sashelp.class dataset separated by backslash:

  %let x = %getVars(sashelp.class, sep=\);
  %put &=x;

EXAMPLE 3. Use of regular expressions:

a) A list of variables whose name contains "i" or "a"

  %put *%getVars(sashelp.class, pattern=i|a)*;

b) A list of variables whose name starts with "w"

  %put *%getVars(sashelp.class, pattern=^w)*;

c) A list of variables whose name ends with "ght"

  %put *%getVars(sashelp.class, pattern=ght$)*;

EXAMPLE 4. A list of numeric variables whose name starts with "w" or "h" or ends with "x"

  %put *%getVars(sashelp.class, sep=+, pattern=^(w|h)|x$, varRange=_numeric_)*;

EXAMPLE 5.

  data test;
    array x[30];
    array y[30] $ ;
    array z[30];
  run;

a) A list of variables separated by a comma:

  %put *%getVars(test, sep=%str(,))*;

b) A list of variables separated by a comma with suffix 5 or 7:

  %put *%getVars(test, sep=%str(,), pattern=(5|7)$)*;

c) A list of variables separated by a comma with suffix 5 or 7 from a given variables range:

  %put *%getVars(test, sep=%str(,), varRange=x10-numeric-z22 y6-y26, pattern=(5|7)$)*;

EXAMPLE 6. Case of quotes and special characters when the quote= parameter is not used:

a) one single or double quote:

  %put *%bquote(%getVars(sashelp.class, sep=%str(%")))*;
  %put *%bquote(%getVars(sashelp.class, sep=%str(%')))*;

b) two single or double quotes:

  %put *"%bquote(%getVars(sashelp.class,sep=""))"*;
  %put *%str(%')%bquote(%getVars(sashelp.class,sep=''))%str(%')*;

c) comma separated double quote list:

  %put *"%getVars(sashelp.class,sep=%str(", "))"*;

d) comma separated single quote list:

  %put *%str(%')%getVars(sashelp.class,sep=', ')%str(%')*;
  %let x = %str(%')%getVars(sashelp.class,sep=', ')%str(%');

  %put *%str(%')%QgetVars(sashelp.class,sep=', ')%str(%')*;
  %let y = %str(%')%QgetVars(sashelp.class,sep=', ')%str(%');
  %let z = %unquote(&y.);

e) ampersand (&) as a separator [compare behaviour]:

  %put *%getVars(sashelp.class,sep=&)*;
  %let x = %getVars(sashelp.class,sep=&);

  %put *%getVars(sashelp.class,sep=%str( & ))*;
  %let x = %getVars(sashelp.class,sep=%str( & ));

  %put *%QgetVars(sashelp.class,sep=&)*;
  %let y = %QgetVars(sashelp.class,sep=&);
  %let z = %unquote(&y.);

  %put *%QgetVars(sashelp.class,sep=%str( & ))*;
  %let y = %QgetVars(sashelp.class,sep=%str( & ));
  %let z = %unquote(&y.);

f) percent (%) as a separator [compare behaviour]:

  %put *%QgetVars(sashelp.class,sep=%)*;
  %let y = %QgetVars(sashelp.class,sep=%);
  %let z = %unquote(&y.);

  %put *%QgetVars(sashelp.class,sep=%str( % ))*;
  %let y = %QgetVars(sashelp.class,sep=%str( % ));
  %let z = %unquote(&y.);

EXAMPLE 7. Case of quotes and special characters when the quote= parameter is used:

a) one single or double quote:

  %put *%getVars(sashelp.class, quote=%str(%"))*;
  %put *%getVars(sashelp.class, quote=%str(%'))*;

b) two single or double quotes:

  %* this gives an error:                   ;
  %* %put *%getVars(sashelp.class,quote="")*;
  %* %put *%getVars(sashelp.class,quote='')*;

  %* this does not give an error:         ;
  %put *%QgetVars(sashelp.class,quote="")*;
  %put *%QgetVars(sashelp.class,quote='')*;

c) comma separated double quote list:

  %put *%getVars(sashelp.class,sep=%str(,),quote=%str(%"))*;

d) comma separated single quote list:

  %let x = %getVars(sashelp.class,sep=%str(,),quote=%str(%'));
  %put &=x.;

EXAMPLE 8. Variables that start with A and do not end with GHT:

data class;
  set sashelp.class;
  Aeight  = height;
run;

%put *%getVars(class, pattern = ^A(.*)(?<!ght)$, quote=%str(%"))*;

EXAMPLE 9. Variables that do not start with N and do not end with GHT:

data class;
  set sashelp.class;
  Aeight  = height;
  Neight  = height;
run;

%put *%getVars(class, pattern = ^(?!N.*)(.*)(?<!ght)$, quote=%str(%"))*;

EXAMPLE 10. Composition with itself:

  data class;
    set sashelp.class;
    Age_C    = put(Age, best32.);
    Height_C = put(Height, best32.);
    Weight_C = put(Weight, best32.);
  run;

  %put #%getVars(class, varRange=_numeric_, sep=%str(: ))# <- no : at the end!!;

  %put #%getVars(class, varRange=%getVars(class, varRange=_numeric_, sep=%str(: )):, sep=\)#;

EXAMPLE 11. Create a macro array XYZ... of variable names and an additional macro %XYZ() which allows easy access to the list. It can be used with the %do_over() macro (provided with the macroArray package).

  data test;
    array x[30];
    array y[30] $ ;
    array z[30];
  run;

  %getVars(test
          ,mcArray=XYZ
          ,varRange=x10-numeric-z22 y6-y26
          ,pattern=(5|7)$
          ,quote=#)

  %put _user_;
  %put *%XYZ(1)**%XYZ(2)*%XYZ(3)*;
  
  %* Load the macroArray package first. ; 
  %put %do_over(XYZ);

>>> %QgetVars() macro: <<<

The getVars() and QgetVars() macro functions extract variable names from a dataset into a list, according to a given pattern.

The getVars() returns an unquoted value [by %unquote()]. The QgetVars() returns a quoted value [by %superq()].

The %QgetVars() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%QgetVars(
   ds               
 <,sep=>
 <,pattern=>
 <,varRange=>
 <,quote=>          
)

Arguments description:

  1. ds - Required, the name of the dataset from which variables are to be taken.
  • sep = %str( ) - Optional, default value %str( ), a variables separator on the created list.

  • pattern = .* - Optional, default value .* (i.e. any text), a variable name regexp pattern, case INSENSITIVE!

  • varRange = _all_ - Optional, default value _all_, a named range list of variables.

  • quote = - Optional, default value is blank, a quotation symbol to be used around values.

EXAMPLES AND USECASES:

See examples in %getVars() help for the details.


>>> %symdelGlobal() macro: <<<

The %symdelGlobal() macro deletes all global macrovariables created by the user. The only exceptions are read-only variables and variables whose names start with SYS, AF, or FSP. In those cases a warning is printed in the log.

One temporary global macrovariable ________________98_76_54_32_10_ and a dataset in the WORK library, named _%sysfunc(datetime(),hex7.), are created and deleted during the process.

The %symdelGlobal() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%symdelGlobal(
 info
)

Arguments description:

  1. info - Optional, default value is empty. If set to NOINFO or QUIET, then notes and warnings about variable deletion are suppressed.

EXAMPLES AND USECASES:

EXAMPLE 1. Basic use-case one. Delete global macrovariables, info notes and warnings are printed in the log.

  %let a = 1;
  %let b = 2;
  %let c = 3;
  %let sys_my_var = 11;
  %let  af_my_var = 22;
  %let fsp_my_var = 33;
  %global / readonly read_only_x = 1234567890;

  %put _user_;

  %symdelGlobal();

  %put _user_;

EXAMPLE 2. Basic use-case two. Delete global macrovariables in quiet mode. No info notes or warnings are printed in the log.

  %let a = 1;
  %let b = 2;
  %let c = 3;
  %let sys_my_var = 11;
  %let  af_my_var = 22;
  %let fsp_my_var = 33;
  %global / readonly read_only_x = 1234567890;

  %put _user_;
  %put *%symdelGlobal(NOINFO)*;
  %put _user_;

>>> bool. format: <<<

The bool format returns: zero for 0 or missing, one for other values.

EXAMPLES AND USECASES:

It allows for a %sysevalf()'ish conversion-type [i.e. %sysevalf(1.7 & 4.2, boolean)] inside the %sysfunc() [e.g. %sysfunc(aFunction(), bool.)]
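
A minimal sketch of that usage, assuming the BasePlus package is already loaded so that the bool. format exists. The sum() function is only a stand-in for aFunction(), and the values in the comments are what the description above implies, not captured log output.

```sas
/* Assumes the BasePlus package is loaded (the bool. format must exist). */
/* sum() is just a stand-in for any numeric function.                    */
%put %sysfunc(sum(1.7,  4.2), bool.); /* non-zero result, expected 1 */
%put %sysfunc(sum(1.7, -1.7), bool.); /* zero result,     expected 0 */
```

The second argument of %sysfunc() is an ordinary format reference, so the conversion happens without an extra %sysevalf() call around the function result.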


>>> boolz. format: <<<

The boolz format returns: zero for 0 or missing, one for other values.

Fuzz value is 0.

EXAMPLES AND USECASES:

It allows for a %sysevalf()'ish conversion-type [i.e. %sysevalf(1.7 & 4.2, boolean)] inside the %sysfunc() [e.g. %sysfunc(aFunction(), boolz.)]


>>> ceil. format: <<<

The ceil format is a "wrapper" for the ceil() function.

EXAMPLES AND USECASES:

It allows for a %sysevalf()'ish conversion-type [i.e. %sysevalf(1.7 + 4.2, ceil)] inside the %sysfunc() [e.g. %sysfunc(aFunction(), ceil.)]


>>> floor. format: <<<

The floor format is a "wrapper" for the floor() function.

EXAMPLES AND USECASES:

It allows for a %sysevalf()'ish conversion-type [i.e. %sysevalf(1.7 + 4.2, floor)] inside the %sysfunc() [e.g. %sysfunc(aFunction(), floor.)]


>>> int. format: <<<

The int format is a "wrapper" for the int() function.

EXAMPLES AND USECASES:

It allows for a %sysevalf()'ish conversion-type [i.e. %sysevalf(1.7 + 4.2, integer)] inside the %sysfunc() [e.g. %sysfunc(aFunction(), int.)]
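
As a rough illustration of the ceil., floor., and int. formats described above, the sketch below applies each of them to a function result inside %sysfunc(). It assumes the BasePlus package is already loaded; sum() is only a stand-in for aFunction(), and the commented values follow from the definitions of ceil(), floor(), and int() rather than from a captured log.

```sas
/* Assumes the BasePlus package is loaded (ceil., floor., int. must exist). */
%put ceil:  %sysfunc(sum( 1.7,  4.2), ceil.);  /* ceil(5.9),  expected 6  */
%put floor: %sysfunc(sum( 1.7,  4.2), floor.); /* floor(5.9), expected 5  */
%put int:   %sysfunc(sum(-1.7, -4.2), int.);   /* int(-5.9),  expected -5 */
```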


>>> arrFill() subroutine: <<<

The arrFill() subroutine is a wrapper for the Call Fillmatrix() [a special FCMP subroutine].

A numeric array is filled with selected numeric value, e.g.

for array A = [. . . .] the subroutine call arrFill(42, A) returns A = [42 42 42 42]

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrFill(N ,A)

Arguments description:

  1. N - Numeric value.

  2. A - Numeric array.

EXAMPLES AND USECASES:

Example 1.

data _null_;
 array X[*] a b c;

 put "before: " (_all_) (=);
 call arrFill(42, X);
 put "after:  " (_all_) (=);

run;

>>> arrFillC() subroutine: <<<

The arrFillC() subroutine fills a character array with selected character value, e.g.

for array A = [" ", " ", " "] the subroutine call arrFillC("B", A) returns A = ["B", "B", "B"]

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrFillC(C ,A)

Arguments description:

  1. C - Character value.

  2. A - Character array.

EXAMPLES AND USECASES:

Example 1.

data _null_;
 array X[*] $ a b c;

 put "before: " (_all_) (=);
 call arrFillC("ABC", X);
 put "after:  " (_all_) (=);

run;

>>> arrMissFill() subroutine: <<<

The arrMissFill() subroutine fills all missing values (i.e. less than or equal to .Z) of a numeric array with a selected numeric value, e.g.

for array A = [1 . . 4] the subroutine call arrMissFill(42, A) returns A = [1 42 42 4]

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrMissFill(N ,A)

Arguments description:

  1. N - Numeric value.

  2. A - Numeric array.

EXAMPLES AND USECASES:

Example 1.

data have;
  input a b c;
cards4;
1 . 3
. 2 .
. . 3
;;;;
run;

data _null_;
 set have ;
 array X[*] a b c;

 put "before: " (_all_) (=);
 call arrMissFill(42, X);
 put "after:  " (_all_) (=);

run;

>>> arrMissFillC() subroutine: <<<

The arrMissFillC() subroutine fills all missing values of a character array with selected character value, e.g.

for array A = ["A", " ", "C"] the subroutine call arrMissFillC("B", A) returns A = ["A", "B", "C"]

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrMissFillC(C, A)

Arguments description:

  1. C - Character value.

  2. A - Character array.

EXAMPLES AND USECASES:

Example 1.

data have;
  infile cards dsd dlm="," missover;
  input (a b c) (: $ 1.);
cards4;
A, ,C
 ,B, 
 , ,C
;;;;
run;

data _null_;
 set have ;
 array X[*] $ a b c;

 put "before: " (_all_) (=);
 call arrMissFillC("X", X);
 put "after:  " (_all_) (=);

run;

>>> arrMissToLeft() subroutine: <<<

The arrMissToLeft() subroutine shifts all non-missing (i.e. greater than .Z) numeric elements to the right side of an array and missing values to the left, e.g.

for array A = [1 . 2 . 3] the subroutine call arrMissToLeft(A) returns A = [. . 1 2 3]

All missing values are replaced with the dot (.)

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrMissToLeft(A)

Arguments description:

  1. A - Numeric array.

EXAMPLES AND USECASES:

Example 1.

data have;
  input a b c;
cards4;
1 . 3
. 2 .
. . 3
;;;;
run;

data _null_;
 set have ;
 array X[*] a b c;

 put "before: " (_all_) (=);
 call arrMissToLeft(X);
 put "after:  " (_all_) (=);

run;

>>> arrMissToLeftC() subroutine: <<<

The arrMissToLeftC() subroutine shifts all non-missing (i.e. different from an empty string) character elements to the right side of an array and all missing values to the left, e.g.

for array A = ["A", " ", "B", " ", "C"] the subroutine call arrMissToLeftC(A) returns A = [" ", " ", "A", "B", "C"]

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrMissToLeftC(A)

Arguments description:

  1. A - Character array.

EXAMPLES AND USECASES:

Example 1.

data have;
  infile cards dsd dlm="," missover;
  input (a b c) (: $ 1.);
cards4;
A, ,C
 ,B, 
 , ,C
;;;;
run;

data _null_;
 set have ;
 array X[*] $ a b c;

 put "before: " (_all_) (=);
 call arrMissToLeftC(X);
 put "after:  " (_all_) (=);

run;

>>> arrMissToRight() subroutine: <<<

The arrMissToRight() subroutine shifts all non-missing (i.e. greater than .Z) numeric elements to the left side of an array and missing values to the right, e.g.

for array A = [1 . 2 . 3] the subroutine call arrMissToRight(A) returns A = [1 2 3 . .]

All missing values are replaced with the dot (.)

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrMissToRight(A)

Arguments description:

  1. A - Numeric array.

EXAMPLES AND USECASES:

Example 1.

data have;
  input a b c;
cards4;
1 . 3
. 2 .
. . 3
;;;;
run;

data _null_;
 set have ;
 array X[*] a b c;

 put "before: " (_all_) (=);
 call arrMissToRight(X);
 put "after:  " (_all_) (=);

run;

>>> arrMissToRightC() subroutine: <<<

The arrMissToRightC() subroutine shifts all non-missing (i.e. different from an empty string) character elements to the left side of an array and missing values to the right, e.g.

for array A = ["A", " ", "B", " ", "C"] the subroutine call arrMissToRightC(A) returns A = ["A", "B", "C", " ", " "]

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call arrMissToRightC(A)

Arguments description:

  1. A - Character array.

EXAMPLES AND USECASES:

Example 1.

data have;
  infile cards dsd dlm="," missover;
  input (a b c) (: $ 1.);
cards4;
A, ,C
 ,B, 
 , ,C
;;;;
run;

data _null_;
 set have ;
 array X[*] $ a b c;

 put "before: " (_all_) (=);
 call arrMissToRightC(X);
 put "after:  " (_all_) (=);

run;

>>> catXFc() function: <<<

The catXFc() function is a wrapper of the catX() function but with ability to format character values.

For array A = ["a", " ", "c"] the catXFc("upcase.", "*", A) returns "A*C".

If format does not handle nulls they are ignored.

Caution! Array parameters to function calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

catXFc(format, delimiter, A)

Arguments description:

  1. format - A name of the character format to be used.

  2. delimiter - A delimiter string to be used.

  3. A - Character array

EXAMPLES AND USECASES:

Example 1.

data _null_;
  t = "t";
  u = " ";
  v = "v";

  array b[*] t u v;

  length s $ 17;
  s = catXFc("upcase.", "*", B);
  put (_all_) (=);
run;

>>> catXFi() function: <<<

The catXFi() function is a wrapper of the catX() function with the ability to format numeric values, but it IGNORES missing values (i.e. ._, ., .a, ..., .z).

For array A = [0, ., 2] the catXFi("date9.", "#", A) returns "01JAN1960#03JAN1960"

Caution! Array parameters to function calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

catXFi(format, delimiter, A)

Arguments description:

  1. format - A name of the numeric format to be used.

  2. delimiter - A delimiter string to be used.

  3. A - Numeric array

EXAMPLES AND USECASES:

Example 1.

data _null_;
  x = 1;
  y = .;
  z = 3;

  array a[*] x y z;

  length s $ 17;
  s = catXFi("z5.", "#", A);
  put (_all_) (=);
run;

>>> catXFj() function: <<<

The catXFj() function is a wrapper of the catX() function but with ability to format character values.

For array A = ["a", " ", "c"] the catXFj("upcase.", "*", A) returns "A**C"

If format does not handle nulls they are printed as an empty string.

Caution! Array parameters to function calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

catXFj(format, delimiter, A)

Arguments description:

  1. format - A name of the character format to be used.

  2. delimiter - A delimiter string to be used.

  3. A - Character array

EXAMPLES AND USECASES:

Example 1.

data _null_;
  t = "t";
  u = " ";
  v = "v";

  array b[*] t u v;

  length s $ 17;
  s = catXFj("upcase.", "*", B);
  put (_all_) (=);
run;

>>> catXFn() function: <<<

The catXFn() function is a wrapper of the catX() function but with ability to format numeric values.

For array A = [0, 1, 2] the catXFn("date9.", "#", A) returns "01JAN1960#02JAN1960#03JAN1960"

Caution! Array parameters to function calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

catXFn(format, delimiter, A)

Arguments description:

  1. format - A name of the numeric format to be used.

  2. delimiter - A delimiter string to be used.

  3. A - Numeric array

EXAMPLES AND USECASES:

Example 1.

data _null_;
  x = 1;
  y = .;
  z = 3;

  array a[*] x y z;

  length s $ 17;
  s = catXFn("z5.", "#", A);
  put (_all_) (=);
run;

>>> delDataset() function: <<<

The delDataset() function is a "wrapper" for the Fdelete() function. delDataset() function uses a text string with a dataset name as an argument.

The function checks for *.sas7bdat, *.sas7bndx, and *.sas7bvew files and deletes them. A return code of 0 means the dataset was deleted.

For a compound library, files are deleted from ALL locations!

Note: Currently only the BASE SAS engine datasets/views are deleted.

Tested on Windows and Linux. Not tested on Z/OS.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

delDataset(lbds_)

Arguments description:

  1. lbds_ - Required, character argument containing name of the dataset/view to be deleted. The _last_ special name is honored.

EXAMPLES AND USECASES:

EXAMPLE 1.

  data TEST1 TEST2(index=(x));
    x = 17;
  run;

  data TEST3 / view=TEST3;
    set test1;
  run;

  data _null_;
    p = delDataset("WORK.TEST1");
    put p=;

    p = delDataset("TEST2");
    put p=;

    p = delDataset("WORK.TEST3");
    put p=;
  run;

Example 2.

  data TEST4;
    x=42;
  run;
  data _null_;
    p = delDataset("_LAST_");
    put p=;
  run;

Example 3.

  options dlcreatedir;
  libname user "%sysfunc(pathname(work))/user";

  data TEST5;
    x=42;
  run;

  data _null_;
    p = delDataset("test5");
    put p=;
  run;

  libname user clear;

Example 4.

  data TEST6;
    x=42;
  run;

  %put *%sysfunc(delDataset(test6))*;

Example 5.

  options dlcreatedir;
  libname L1 "%sysfunc(pathname(work))/L)1";
  libname L2 "%sysfunc(pathname(work))/L(2";
  libname L3 "%sysfunc(pathname(work))/L'3";

  data L1.TEST7 L2.TEST7 L3.TEST7;
    x=42;
  run;

  libname L12 ("%sysfunc(pathname(work))/L)1" "%sysfunc(pathname(work))/L(2");
  libname L1L2 (L2 L3);

  %put *%sysfunc(delDataset(L12.test7))*;
  %put *%sysfunc(delDataset(L1L2.test7))*;

>>> qsortInCbyProcProto() proto function: <<<

The qsortInCbyProcProto() is an external C function implementing the Quick Sort algorithm.

The function is used internally by functions in the BasePlus package.

Assumptions:

  • smaller subarray is sorted first,
  • subarrays of size < 11 are sorted by insertion sort,
  • pivot is selected as median of low index value, high index value, and (low+high)/2 index value.
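
The pivot rule in the last bullet can be illustrated with a short DATA step. This is only a sketch of the selection step written in SAS, not the package's C code; the array contents and the variable names low, mid, high, and pivot are made up for the illustration.

```sas
data _null_;
  array A[7] _temporary_ (9 4 7 1 8 2 5);

  low  = 1;
  high = dim(A);
  mid  = int((low + high) / 2);  /* the (low+high)/2 index, here 4 */

  /* median of the three candidate values: A[1]=9, A[4]=1, A[7]=5 */
  pivot = median(A[low], A[mid], A[high]);

  put low= mid= high= pivot=;    /* the median of 9, 1, 5 is 5 */
run;
```

Taking the median of three candidates instead of a fixed position is a common way to avoid the worst-case behaviour of quick sort on already ordered input.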

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!CAUTION! Sorted array CANNOT contain SAS missing values  !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

SYNTAX:

The basic syntax is the following:

qsortInCbyProcProto(arr, low, high)

Arguments description:

  1. arr - An array of double type to be sorted.

  2. low - An integer low index of starting position (from which the sorting is done).

  3. high - An integer high index of ending position (up to which the sorting is done).

REFERENCES:

Reference 1.

Insertion sort for arrays smaller than 11 elements:

Based on the code from the following WikiBooks page [2020.08.14]:

https://pl.wikibooks.org/wiki/Kody_%C5%BAr%C3%B3d%C5%82owe/Sortowanie_przez_wstawianie

Reference 2.

Iterative Quick Sort:

Based on the code from the following pages [2020.08.14]:

https://www.geeksforgeeks.org/iterative-quick-sort/

https://www.geeksforgeeks.org/c-program-for-iterative-quick-sort/


>>> fromMissingToNumberBS() function: <<<

The fromMissingToNumberBS() function gets numeric missing value or a number as an argument and returns an integer from 1 to 29.

For a numeric missing argument the returned values are:

  • 1 for ._
  • 2 for .
  • 3 for .a
  • ...
  • 28 for .z and
  • 29 for all other.

The function is used internally by functions in the BasePlus package.

For missing value arguments the function is an inverse of the fromNumberToMissing() function.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

fromMissingToNumberBS(x)

Arguments description:

  1. x - A numeric missing value or a number.

EXAMPLES AND USECASES:

EXAMPLE 1.

  data _null_;
    do x = ._, ., .a, .b, .c, 42;
      y = fromMissingToNumberBS(x);
      put x= y=; 
    end;
  run;

>>> fromNumberToMissing() function: <<<

The fromNumberToMissing() function gets a number as an argument and returns a numeric missing value or zero.

For a numeric argument the returned values are:

  • ._ for 1
  • . for 2
  • .a for 3
  • ...
  • .z for 28 and
  • 0 for all other.

The function is used internally by functions in the BasePlus package.

For arguments 1,2,3, ..., and 28 the function is an inverse of the fromMissingToNumberBS() function.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

fromNumberToMissing(x)

Arguments description:

  1. x - A numeric value.

EXAMPLES AND USECASES:

EXAMPLE 1.

  data _null_;
    do x = 1 to 29;
      y = fromNumberToMissing(x);
      put x= y=; 
    end;
  run;

>>> quickSort4NotMiss() subroutine: <<<

The quickSort4NotMiss() subroutine is an alternative to the CALL SORTN() subroutine for 1-based big arrays (i.e. > 10'000'000 elements) when memory used by call sortn() may be an issue. For smaller arrays the memory footprint is not significant.

The subroutine is based on an iterative quick sort algorithm implemented in the qsortInCbyProcProto() C prototype function.

Caution 1! Array CANNOT contain missing values!

Caution 2! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call quickSort4NotMiss(A)

Arguments description:

  1. A - Argument is a 1-based array of NOT missing numeric values.

EXAMPLES AND USECASES:

EXAMPLE 1. For session with 8GB of RAM, array of size 250'000'000 with values in range from 0 to 99'999'999 and NO missing values.

  %let size = 250000000;
  options fullstimer;

  data _null_;
    array test[&size.] _temporary_ ;

    t = time();
    call streaminit(123);
    do _N_ = &size. to 1 by -1;
      test[_N_] = int(100000000*rand("uniform"));
    end;
    t = time() - t;
    put "Array population time: "  t;

    put "First 20 elements before sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;  

    t = time();
    call quickSort4NotMiss (test);
    t = time()-t;
    put "Sorting time: " / t=;

    put; put "First 20 elements after sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;
  run;

Example 2. Resources comparison for session with 8GB of RAM.

Array of size 250'000'000 with random values from 0 to 999'999'999 and NO missing values.

    Array:
      Population time     8.82s
      Memory              1'953'470.62k
      OS Memory           1'977'436.00k

    Call quickSort4NotMiss:
      Sorting time        66.92s
      Memory              1'954'683.06k
      OS Memory           1'977'436.00k

    Call quickSortLight:
      Sorting time        70.98s
      Memory              1'955'479.71k
      OS Memory           1'977'436.00k

>>> quickSortHash() subroutine: <<<

The quickSortHash() subroutine is an alternative to the CALL SORTN() subroutine for 1-based big arrays (i.e. > 10'000'000 elements) when memory used by call sortn() may be an issue. For smaller arrays the memory footprint is not significant.

The subroutine is based on an iterative quick sort algorithm implemented in the qsortInCbyProcProto() C prototype function.

The number of "sparse distinct data values" is set to 100'000 to use the hash sort instead of the quick sort. E.g. when the number of unique values to sort is less than 100'000, an ordered hash table is used to store the values together with their counts and to sort them.
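
The idea of sorting through an ordered hash table of values and counts can be sketched in a plain DATA step. This is not the package's FCMP implementation, only an illustration of the technique on a tiny array without missing values; the names H, IH, value, and count are assumptions made for the example.

```sas
data _null_;
  array A[10] _temporary_ (3 1 2 3 1 7 2 3 1 2);

  length value count 8;
  declare hash H(ordered: "A");        /* keys are kept in ascending order */
  H.defineKey("value");
  H.defineData("value", "count");
  H.defineDone();
  declare hiter IH("H");

  /* pass 1: count how many times each distinct value occurs */
  do i = 1 to dim(A);
    value = A[i];
    if H.find() ne 0 then count = 0;   /* first occurrence of this value */
    count = count + 1;
    H.replace();
  end;

  /* pass 2: write values back in key order, each repeated count times */
  i = 0;
  do while (IH.next() = 0);
    do j = 1 to count;
      i = i + 1;
      A[i] = value;
    end;
  end;

  do i = 1 to dim(A);
    put A[i]= @;                       /* order: 1 1 1 2 2 2 3 3 3 7 */
  end;
  put;
run;
```

With few distinct values the hash table stays small no matter how long the array is, which is why this approach pays off only for sparse data.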

Caution! Array parameters to subroutine calls must be 1-based.

Note! Due to improper memory reporting/releasing for hash tables in FCMP procedure the reported memory used after running the function may not be in line with the RAM memory required for processing.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call quickSortHash(A)

Arguments description:

  1. A - Argument is a 1-based array of numeric values.

EXAMPLES AND USECASES:

EXAMPLE 1. For a session with 8GB of RAM. Array of size 250'000'000 with values in the range from 0 to 99'999'999 and around 10% of various missing values.

  %let size = 250000000;
  options fullstimer;

  data _null_;
    array test[&size.] _temporary_ ;

    array m[0:27] _temporary_ 
      (._ .  .A .B .C .D .E .F .G .H .I .J .K .L 
       .M .N .O .P .Q .R .S .T .U .V .W .X .Y .Z);

    t = time();
    call streaminit(123);
    do _N_ = &size. to 1 by -1;
      _I_ + 1;
      if rand("uniform") > 0.1 then test[_I_] = int(100000000*rand("uniform"));
                               else test[_I_] = m[mod(_N_,28)];
    end;
    t = time() - t;
    put "Array population time: "  t;

    put "First 20 elements before sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;  

    t = time();
    call quickSortHash (test);
    t = time()-t;
    put "Sorting time: " / t=;

    put; put "First 20 elements after sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;
  run;

EXAMPLE 2. For a session with 8GB of RAM. Array of size 250'000'000 with values in the range from 0 to 9'999 and around 10% of various missing values.

  %let size = 250000000;
  options fullstimer;

  data _null_;
    array test[&size.] _temporary_ ;

    array m[0:27] _temporary_ 
      (._ .  .A .B .C .D .E .F .G .H .I .J .K .L 
       .M .N .O .P .Q .R .S .T .U .V .W .X .Y .Z);

    t = time();
    call streaminit(123);
    do _N_ = &size. to 1 by -1;
      _I_ + 1;
      if rand("uniform") > 0.1 then test[_I_] = int(10000*rand("uniform"));
                               else test[_I_] = m[mod(_N_,28)];
    end;
    t = time() - t;
    put "Array population time: "  t;

    put "First 20 elements before sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;  

    t = time();
    call quickSortHash (test);
    t = time()-t;
    put "Sorting time: " / t=;

    put; put "First 20 elements after sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;
  run;

EXAMPLE 3. Resources comparison for a session with 8GB of RAM.

A) Array of size 10'000'000 with random values from 0 to 9'999 range (sparse) and around 10% of missing data.

    Array:
      Population time     0.61s
      Memory              78'468.50k
      OS Memory           101'668.00k

    Call sortn:
      Sorting time        0.87s
      Memory              1'120'261.53k
      OS Memory           1'244'968.00k

    Call quickSortHash:
      Sorting time        6.76s
      Memory              1'222'242.75k(*)
      OS Memory           1'402'920.00k(*)

    Call quickSortLight:
      Sorting time        23.45s
      Memory              80'527.75k
      OS Memory           101'924.00k

B) Array of size 10'000'000 with random values from 0 to 99'999'999 range (dense) and around 10% of missing data.

    Array:
      Population time     0.6s
      Memory              78'463.65k
      OS Memory           101'924.00k

    Call sortn:
      Sorting time        1.51s
      Memory              1'120'253.53k
      OS Memory           1'244'968.00k

    Call quickSortHash:
      Sorting time        6.28s
      Memory              1'222'241.93k(*)
      OS Memory           1'402'920.00k(*)

    Call quickSortLight:
      Sorting time        0.78s
      Memory              80'669.28k
      OS Memory           102'436.00k

C) Array of size 250'000'000 with random values from 0 to 999'999'999 range (dense) and around 10% of missing data.

    Array:
      Population time     15.34s
      Memory              1'953'471.81k
      OS Memory           1'977'436.00k

    Call sortn:
      FATAL: Insufficient memory to execute DATA step program. 
             Aborted during the COMPILATION phase.
      ERROR: The SAS System stopped processing this step 
             because of insufficient memory.

    Call quickSortHash:
      Sorting time        124.68s
      Memory              7'573'720.34k(*)
      OS Memory           8'388'448.00k(*)

    Call quickSortLight:
      Sorting time        72.41s
      Memory              1'955'520.78k
      OS Memory           1'977'180.00k

D) Array of size 250'000'000 with random values from 0 to 99'999 range (sparse) and around 10% of missing data.

    Array:
      Population time     16.07s
      Memory              1'953'469.78k
      OS Memory           1'977'180.00k

    Call sortn:
      FATAL: Insufficient memory to execute DATA step program. 
             Aborted during the COMPILATION phase.
      ERROR: The SAS System stopped processing this step 
             because of insufficient memory.

    Call quickSortHash:
      Sorting time        123.5s
      Memory              7'573'722.03k
      OS Memory           8'388'448.00k

    Call quickSortLight:
      Sorting time        1'338.25s
      Memory              1'955'529.90k
      OS Memory           1'977'436.00k

(*) When hash tables are used in PROC FCMP, the RAM usage is not reported properly. Memory allocation is reported up to the session limit and the memory is then reused if needed. The memory actually required is much less than reported.


>>> quickSortHashSDDV() subroutine: <<<

The quickSortHashSDDV() subroutine is an alternative to the CALL SORTN() subroutine for 1-based big arrays (i.e. > 10'000'000 elements) when memory used by call sortn() may be an issue. For smaller arrays the memory footprint is not significant.

The subroutine is based on an iterative quick sort algorithm implemented in the qsortInCbyProcProto() C prototype function.

The threshold of "sparse distinct data values" (argument SDDV) may be adjusted to control when the hash sort is used instead of the quick sort. When the number of unique values to sort is less than SDDV, an ordered hash table is used to store the values and their counts and to sort them.

Caution! Array parameters to subroutine calls must be 1-based.

Note! Because memory used by hash tables is not reported/released properly in the FCMP procedure, the memory reported after running the subroutine may not reflect the RAM actually required for processing.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call quickSortHashSDDV(A, SDDV)

Arguments description:

  1. A - Argument is a 1-based array of numeric values.

  2. SDDV - The threshold number of distinct data values below which the hash sort is used, e.g. 100'000.
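
To try the call without allocating a very large array, a small-scale sketch may help. It assumes the BasePlus package is already loaded in the session; the array contents and the threshold of 10 are arbitrary illustration values. With only a handful of distinct values, the description above suggests the hash sort branch would be used:

```sas
/* assumes the BasePlus package is already loaded in the session */
data _null_;
  array A[8] _temporary_ (5 3 . 5 1 3 .A 1);

  /* second argument: threshold of distinct data values (illustrative) */
  call quickSortHashSDDV(A, 10);

  do i = 1 to dim(A);
    put A[i]= @;
  end;
  put;
run;
```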

EXAMPLES AND USECASES:

EXAMPLE 1. For a session with 8GB of RAM. Array of size 250'000'000 with values in the range from 0 to 99'999'999 and around 10% of various missing values.

  %let size = 250000000;
  options fullstimer;

  data _null_;
    array test[&size.] _temporary_ ;

    array m[0:27] _temporary_ 
      (._ .  .A .B .C .D .E .F .G .H .I .J .K .L 
       .M .N .O .P .Q .R .S .T .U .V .W .X .Y .Z);

    t = time();
    call streaminit(123);
    do _N_ = &size. to 1 by -1;
      _I_ + 1;
      if rand("uniform") > 0.1 then test[_I_] = int(100000000*rand("uniform"));
                               else test[_I_] = m[mod(_N_,28)];
    end;
    t = time() - t;
    put "Array population time: "  t;

    put "First 20 elements before sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;  

    t = time();
    call quickSortHashSDDV (test, 2e4);
    t = time()-t;
    put "Sorting time: " / t=;

    put; put "First 20 elements after sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;
  run;

EXAMPLE 2. For a session with 8GB of RAM. Array of size 250'000'000 with values in the range from 0 to 9'999 and around 10% of various missing values.

  %let size = 250000000;
  options fullstimer;

  data _null_;
    array test[&size.] _temporary_ ;

    array m[0:27] _temporary_ 
      (._ .  .A .B .C .D .E .F .G .H .I .J .K .L 
       .M .N .O .P .Q .R .S .T .U .V .W .X .Y .Z);

    t = time();
    call streaminit(123);
    do _N_ = &size. to 1 by -1;
      _I_ + 1;
      if rand("uniform") > 0.1 then test[_I_] = int(10000*rand("uniform"));
                               else test[_I_] = m[mod(_N_,28)];
    end;
    t = time() - t;
    put "Array population time: "  t;

    put "First 20 elements before sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;  

    t = time();
    call quickSortHashSDDV (test, 2e4);
    t = time()-t;
    put "Sorting time: " / t=;

    put; put "First 20 elements after sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;
  run;

>>> quickSortLight() subroutine: <<<

The quickSortLight() subroutine is an alternative to the CALL SORTN() subroutine for 1-based big arrays (i.e. > 10'000'000 elements) when memory used by call sortn() may be an issue. For smaller arrays the memory footprint is not significant.

The subroutine is based on an iterative quick sort algorithm implemented in the qsortInCbyProcProto() C prototype function.

Caution! Array parameters to subroutine calls must be 1-based.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

call quickSortLight(A)

Arguments description:

  1. A - Argument is a 1-based array of numeric values.
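
Before running the subroutine on a very large array, its result can be compared with CALL SORTN() on a small one. The sketch below assumes the BasePlus package is already loaded in the session; the array contents are arbitrary and the two arrays are printed side by side so the orderings can be compared by eye:

```sas
/* assumes the BasePlus package is already loaded in the session */
data _null_;
  array A[6] _temporary_ (4 . 2 .Z 1 ._);
  array B[6] _temporary_ (4 . 2 .Z 1 ._);

  call quickSortLight(A);   /* subroutine from the package   */
  call sortn(of B[*]);      /* base SAS routine, for reference */

  do i = 1 to dim(A);
    put A[i]= B[i]=;
  end;
run;
```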

EXAMPLES AND USECASES:

EXAMPLE 1. For a session with 8GB of RAM. Array of size 250'000'000 with values in the range from 0 to 99'999'999 and around 10% of various missing values.

  %let size = 250000000;
  options fullstimer;

  data _null_;
    array test[&size.] _temporary_ ;

    array m[0:27] _temporary_ 
      (._ .  .A .B .C .D .E .F .G .H .I .J .K .L 
       .M .N .O .P .Q .R .S .T .U .V .W .X .Y .Z);

    t = time();
    call streaminit(123);
    do _N_ = &size. to 1 by -1;
      _I_ + 1;
      if rand("uniform") > 0.1 then test[_I_] = int(100000000*rand("uniform"));
                               else test[_I_] = m[mod(_N_,28)];
    end;
    t = time() - t;
    put "Array population time: "  t;

    put "First 20 elements before sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;  

    t = time();
    call quickSortLight (test);
    t = time()-t;
    put "Sorting time: " / t=;

    put; put "First 20 elements after sorting:";
    do _N_ = 1 to 20;
      put test[_N_] = @;
    end;
  run;

EXAMPLE 2. Resources comparison for a session with 8GB of RAM.

Array of size 250'000'000 with random values from 0 to 999'999'999 and NO missing values.

    Array:
      Population time     8.82s
      Memory              1'953'470.62k
      OS Memory           1'977'436.00k

    Call quickSort4NotMiss:
      Sorting time        66.92s
      Memory              1'954'683.06k
      OS Memory           1'977'436.00k

    Call quickSortLight:
      Sorting time        70.98s
      Memory              1'955'479.71k
      OS Memory           1'977'436.00k

EXAMPLE 3. Resources comparison for a session with 8GB of RAM.

A) Array of size 10'000'000 with random values from 0 to 9'999 range (sparse) and around 10% of missing data.

    Array:
      Population time     0.61s
      Memory              78'468.50k
      OS Memory           101'668.00k

    Call sortn:
      Sorting time        0.87s
      Memory              1'120'261.53k
      OS Memory           1'244'968.00k

    Call quickSortHash:
      Sorting time        6.76s
      Memory              1'222'242.75k(*)
      OS Memory           1'402'920.00k(*)

    Call quickSortLight:
      Sorting time        23.45s
      Memory              80'527.75k
      OS Memory           101'924.00k

B) Array of size 10'000'000 with random values from 0 to 99'999'999 range (dense) and around 10% of missing data.

    Array:
      Population time     0.6s
      Memory              78'463.65k
      OS Memory           101'924.00k

    Call sortn:
      Sorting time        1.51s
      Memory              1'120'253.53k
      OS Memory           1'244'968.00k

    Call quickSortHash:
      Sorting time        6.28s
      Memory              1'222'241.93k(*)
      OS Memory           1'402'920.00k(*)

    Call quickSortLight:
      Sorting time        0.78s
      Memory              80'669.28k
      OS Memory           102'436.00k

C) Array of size 250'000'000 with random values from 0 to 999'999'999 range (dense) and around 10% of missing data.

    Array:
      Population time     15.34s
      Memory              1'953'471.81k
      OS Memory           1'977'436.00k

    Call sortn:
      FATAL: Insufficient memory to execute DATA step program. 
             Aborted during the COMPILATION phase.
      ERROR: The SAS System stopped processing this step 
             because of insufficient memory.

    Call quickSortHash:
      Sorting time        124.68s
      Memory              7'573'720.34k(*)
      OS Memory           8'388'448.00k(*)

    Call quickSortLight:
      Sorting time        72.41s
      Memory              1'955'520.78k
      OS Memory           1'977'180.00k

D) Array of size 250'000'000 with random values from 0 to 99'999 range (sparse) and around 10% of missing data.

    Array:
      Population time     16.07s
      Memory              1'953'469.78k
      OS Memory           1'977'180.00k

    Call sortn:
      FATAL: Insufficient memory to execute DATA step program. 
             Aborted during the COMPILATION phase.
      ERROR: The SAS System stopped processing this step 
             because of insufficient memory.

    Call quickSortHash:
      Sorting time        123.5s
      Memory              7'573'722.03k
      OS Memory           8'388'448.00k

    Call quickSortLight:
      Sorting time        1'338.25s
      Memory              1'955'529.90k
      OS Memory           1'977'436.00k

(*) When hash tables are used in PROC FCMP, the RAM usage is not reported properly. Memory allocation is reported up to the session limit and the memory is then reused if needed. The memory actually required is much less than reported.


>>> %dedupListS() macro: <<<

The %dedupListS() macro removes duplicated values from a SPACE-separated list of values. The list, including separators, can be no longer than the maximum length of a single macro variable value.

The returned value is unquoted.

The %dedupListS() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%dedupListS(
 list of space separated values
)

Arguments description:

  1. list - A list of space separated values.

EXAMPLES AND USECASES:

EXAMPLE 1. Basic use-case one. Delete duplicated values from a list.

  %put *%dedupListS(a b  c   b    c)*;

  %put *%dedupListS(a b,c b,c)*;

  %put *%dedupListS(%str(a b c b c))*;

  %put *%dedupListS(%str(a) %str(b) %str(c) b c)*;

EXAMPLE 2. Macro variable as an argument. Delete duplicated values from a list.

  %let list = 4 5 6 1 2 3 1 2 3 4 5 6;
  %put *%dedupListS(&list.)*;

>>> %dedupListC() macro: <<<

The %dedupListC() macro removes duplicated values from a COMMA-separated list of values. The list, including separators, can be no longer than the maximum length of a single macro variable value.

The returned value is unquoted. Leading and trailing spaces are ignored.

The %dedupListC() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%dedupListC(
 list,of,comma,separated,values
)

Arguments description:

  1. list - A list of comma separated values.

EXAMPLES AND USECASES:

EXAMPLE 1. Basic use-case one. Delete duplicated values from a list.

  %put *%dedupListC(a,b,c,b,c)*;

  %put *%dedupListC(a,b c,b c)*;

  %put *%dedupListC(%str(a,b,c,b,c))*;

  %put *%dedupListC(%str(a),%str(b),%str(c),b,c)*;

EXAMPLE 2. Leading and trailing spaces are ignored. Delete duplicated values from a list.

  %put *%dedupListC( a , b b ,  c , b b, c    )*;

EXAMPLE 3. Macro variable as an argument. Delete duplicated values from a list.

  %let list = 4, 5, 6, 1, 2, 3, 1, 2, 3, 4, 5, 6;
  %put *%dedupListC(&list.)*;
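
Since the macro executes like pure macro code, it can also be called in the middle of a statement. A possible use, sketched under the assumption that the BasePlus package is already loaded, is cleaning a column list in PROC SQL, where repeated column names in a created table trigger a warning. The table name `work.classSubset` is made up for the example:

```sas
/* assumes the BasePlus package is already loaded in the session */
proc sql;
  create table work.classSubset as
    select %dedupListC(name, age, name, sex, age)
    from sashelp.class
  ;
quit;
```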

>>> %dedupListP() macro: <<<

The %dedupListP() macro removes duplicated values from a PIPE(|)-separated list of values. The list, including separators, can be no longer than the maximum length of a single macro variable value.

The returned value is unquoted. Leading and trailing spaces are ignored.

The %dedupListP() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%dedupListP(
 list|of|pipe|separated|values
)

Arguments description:

  1. list - A list of pipe separated values.

EXAMPLES AND USECASES:

EXAMPLE 1. Basic use-case one. Delete duplicated values from a list.

  %put *%dedupListP(a|b|c|b|c)*;

  %put *%dedupListP(a|b c|b c)*;

  %put *%dedupListP(%str(a|b|c|b|c))*;

  %put *%dedupListP(%str(a)|%str(b)|%str(c)|b|c)*;

EXAMPLE 2. Leading and trailing spaces are ignored. Delete duplicated values from a list.

  %put *%dedupListP( a | b b |  c | b b| c    )*;

EXAMPLE 3. Macro variable as an argument. Delete duplicated values from a list.

  %let list = 4|5|6|1|2|3|1|2|3|4|5|6;
  %put *%dedupListP(&list.)*;

>>> %dedupListX() macro: <<<

The %dedupListX() macro removes duplicated values from an X-separated list of values, where X represents a single-character separator. The list, including separators, can be no longer than the maximum length of a single macro variable value.

Caution. The value of X has to be in the first byte of the list, just after the opening bracket, i.e. (X...).

The returned value is unquoted. Leading and trailing spaces are ignored.

The %dedupListX() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%dedupListX(
XlistXofXxXseparatedXvalues
)

Arguments description:

  1. list - A list of X separated values.

EXAMPLES AND USECASES:

EXAMPLE 1. Basic use-case one. Delete duplicated values from a list.

  %put *%dedupListX(|a|b|c|b|c)*;

  %put *%dedupListX( a b c b c)*;

  %put *%dedupListX(,a,b,c,b,c)*;

  %put *%dedupListX(XaXbXcXbXc)*;

  %put *%dedupListX(/a/b/c/b/c)*;

  data _null_;
    x = "%dedupListX(%str(;a;b;c;b;c))";
    put x=;
  run;

EXAMPLE 2. Leading and trailing spaces are ignored. Delete duplicated values from a list.

  %put *%dedupListX(| a | b.b |  c | b.b| c    )*;

  %put *%dedupListX(. a . b b .  c . b b. c    )*;

EXAMPLE 3. Macro variable as an argument. Delete duplicated values from a list.

  %let list = 4$5.5$6$1$2$3$1$2$3$4$5.5$6;
  %put *%dedupListX($&list.)*;

  %let list = 4$ 5.5$ 6$ 1$ 2$ 3$ 1$ 2$ 3$ 4$ 5.5$ 6$;
  %put *%dedupListX( &list.)*;

>>> %QdedupListX() macro: <<<

The %QdedupListX() macro removes duplicated values from an X-separated list of values, where X represents a single-character separator. The list, including separators, can be no longer than the maximum length of a single macro variable value.

Caution. The value of X has to be in the first byte of the list, just after the opening bracket, i.e. (X...).

The returned value is quoted with %superq(). Leading and trailing spaces are ignored.

The %QdedupListX() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%QdedupListX(
XlistXofXxXseparatedXvalues
)

Arguments description:

  1. list - A list of X separated values.

EXAMPLES AND USECASES:

EXAMPLE 1. Basic use-case one. Delete duplicated values from a list.

  %put *%QdedupListX(|a|b|c|b|c)*;

  %put *%QdedupListX( a b c b c)*;

  %put *%QdedupListX(,a,b,c,b,c)*;

  %put *%QdedupListX(XaXbXcXbXc)*;

  %put *%QdedupListX(/a/b/c/b/c)*;

  %put *%QdedupListX(%str(;a;b;c;b;c))*;

  %put *%QdedupListX(%nrstr(&a&b&c&b&c))*;

  %put *%QdedupListX(%nrstr(%a%b%c%b%c))*;

EXAMPLE 2. Leading and trailing spaces are ignored. Delete duplicated values from a list.

  %put *%QdedupListX(| a | b.b |  c | b.b| c    )*;

  %put *%QdedupListX(. a . b b .  c . b b. c    )*;

EXAMPLE 3. Macro variable as an argument. Delete duplicated values from a list.

  %let list = 4$5.5$6$1$2$3$1$2$3$4$5.5$6;
  %put *%QdedupListX($&list.)*;

  %let list = 4$ 5.5$ 6$ 1$ 2$ 3$ 1$ 2$ 3$ 4$ 5.5$ 6$;
  %put *%QdedupListX( &list.)*;

>>> brackets. format: <<<

The brackets format adds brackets around a text or a number. Leading and trailing spaces are dropped before adding brackets.

EXAMPLES AND USECASES:

Example 1.

data _null_;
  input x;
  if x < 0 then put x= brackets.;
           else put x= best32.;
cards;
2
1
0
-1
-2
;
run;

>>> semicolon. format: <<<

The semicolon format adds a semicolon after a text or a number. Leading and trailing spaces are dropped before adding the semicolon.

EXAMPLES AND USECASES:

Example 1.

data _null_;
  x = 1;
  y = "A";
  put x= semicolon. y= $semicolon.;
run;

>>> bracketsC() function: <<<

The bracketsC() function is an internal function used by the brackets format. It returns a character value of length 32767.

SYNTAX:

The basic syntax is the following:

bracketsC(X)

Arguments description:

  1. X - Character value.
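
For readers unfamiliar with formats backed by functions, the general mechanism can be sketched in base SAS as follows. This is not the package's source code: the names `wrapC`, `$wrap` and `work.sketch` are hypothetical, and the use of parentheses as the bracket characters is an assumption made for the illustration.

```sas
proc fcmp outlib = work.sketch.fmt;
  /* hypothetical helper: strip the text and wrap it in parentheses */
  function wrapC(x $) $ 32767;
    return ("(" || strip(x) || ")");
  endsub;
run;

/* make the compiled function visible without dropping existing CMPLIB entries */
options append = (cmplib = work.sketch);

proc format;
  /* a format whose label is computed by a function call */
  value $wrap other = [wrapC()];
run;

data _null_;
  x = "  abc  ";
  put x= $wrap.;
run;
```

The function does the work and the format only forwards the value to it, which is why a separate function is needed for character and for numeric input.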

>>> bracketsN() function: <<<

The bracketsN() function is an internal function used by the brackets format. It returns a character value of length 34.

SYNTAX:

The basic syntax is the following:

bracketsN(X)

Arguments description:

  1. X - Numeric value.

>>> semicolonC() function: <<<

The semicolonC() function is an internal function used by the semicolon format. It returns a character value of length 32767.

SYNTAX:

The basic syntax is the following:

semicolonC(X)

Arguments description:

  1. X - Character value.

>>> semicolonN() function: <<<

The semicolonN() function is an internal function used by the semicolon format. It returns a character value of length 33.

SYNTAX:

The basic syntax is the following:

semicolonN(X)

Arguments description:

  1. X - Numeric value.

>>> %QzipEvalf() macro: <<<

The zipEvalf() and QzipEvalf() macro functions apply a function to the elements of a pair of space-separated lists.

For two space-separated lists of text strings, the corresponding elements are paired and the macro applies a user-provided function to each pair.

When one of the lists is shorter, its elements are "reused" starting from the beginning.

The zipEvalf() returns an unquoted value [by %unquote()]. The QzipEvalf() returns a quoted value [by %superq()].

See examples below for the details.

The %QzipEvalf() macro executes like pure macro code.

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%QzipEvalf(
    first
   ,second
  <,function=>
  <,operator=> 
  <,argBf=>
  <,argMd=>
  <,argAf=>
  <,format=>
)

Arguments description:

  1. first - Required, a space separated list of texts.

  2. second - Required, a space separated list of texts.

  • function = cat - Optional, default value is cat, a function which will be applied to corresponding pairs of elements of the first and the second list.

  • operator = - Optional, default value is empty, an arithmetic infix operator applied to elements of the first and the second list. Elements of the first list go on the left side of the operator and elements of the second list on the right side.

  • argBf = - Optional, default value is empty, arguments of the function inserted before elements of the first list. Multiple arguments should be comma separated.

  • argMd = - Optional, default value is empty, arguments of the function inserted between elements of the first list and the second list. Multiple arguments should be comma separated.

  • argAf = - Optional, default value is empty, arguments of the function inserted after elements of the second list. Multiple arguments should be comma separated.

  • format= - Optional, default value is empty, a format used to format the result. It does not work when operator= is used.

EXAMPLES AND USECASES:

See examples in %zipEvalf() help for the details.


>>> %zipEvalf() macro: <<<

The zipEvalf() and QzipEvalf() macro functions apply a function to the elements of a pair of space-separated lists.

For two space-separated lists of text strings, the corresponding elements are paired and the macro applies a user-provided function to each pair.

When one of the lists is shorter, its elements are "reused" starting from the beginning.

The zipEvalf() returns an unquoted value [by %unquote()]. The QzipEvalf() returns a quoted value [by %superq()].

See examples below for the details.

The %zipEvalf() macro executes like pure macro code.
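
The "reuse" of the shorter list can be pictured with a small stand-alone macro. This is a rough sketch of the pairing logic only, not the package's implementation: the macro name `zipSketch` is hypothetical and plain text concatenation stands in for the default function=cat.

```sas
%macro zipSketch(first, second);
  %local n1 n2 n i a b;
  %let n1 = %sysfunc(countw(&first.,  %str( )));
  %let n2 = %sysfunc(countw(&second., %str( )));
  %let n  = %sysfunc(max(&n1., &n2.));
  %do i = 1 %to &n.;
    /* the position wraps around when a list is shorter than the other */
    %let a = %scan(&first.,  %eval(%sysfunc(mod(%eval(&i. - 1), &n1.)) + 1), %str( ));
    %let b = %scan(&second., %eval(%sysfunc(mod(%eval(&i. - 1), &n2.)) + 1), %str( ));
    &a.&b.
  %end;
%mend zipSketch;

%put %zipSketch(1 2 3 4 5 6, a b c);
```

With these inputs the second list is walked twice, so the pairs are 1a 2b 3c 4a 5b 6c.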

SYNTAX:

The basic syntax is the following, the <...> means optional parameters:

%zipEvalf(
    first
   ,second
  <,function=>
  <,operator=> 
  <,argBf=>
  <,argMd=>
  <,argAf=>
  <,format=>
)

Arguments description:

  1. first - Required, a space separated list of texts.

  2. second - Required, a space separated list of texts.

  • function = cat - Optional, default value is cat, a function which will be applied to corresponding pairs of elements of the first and the second list.

  • operator = - Optional, default value is empty, an arithmetic infix operator applied to elements of the first and the second list. Elements of the first list go on the left side of the operator and elements of the second list on the right side.

  • argBf = - Optional, default value is empty, arguments of the function inserted before elements of the first list. Multiple arguments should be comma separated.

  • argMd = - Optional, default value is empty, arguments of the function inserted between elements of the first list and the second list. Multiple arguments should be comma separated.

  • argAf = - Optional, default value is empty, arguments of the function inserted after elements of the second list. Multiple arguments should be comma separated.

  • format= - Optional, default value is empty, a format used to format the result. It does not work when operator= is used.
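
Reading the argument descriptions together, each pair of elements appears to be evaluated as function(argBf, first, argMd, second, argAf), with the format applied to the result. Under that assumption, the first pair of two of the examples below would correspond to plain %sysfunc() calls like these (base SAS only, the package is not needed to run them):

```sas
/* EXAMPLE 11, first pair: function=MDY, first=1, argMd=5, second=2020 */
%put %sysfunc(MDY(1, 5, 2020), date11.);

/* EXAMPLE 7, first pair: function=catx, argBf=%str(,), first=a, second=efg */
/* (the format= part of that example is left out here)                      */
%put %sysfunc(catx(%str(,), a, efg));
```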

EXAMPLES AND USECASES:

EXAMPLE 1. Simple concatenation of elements:

%let x = %zipEvalf(1 2 3 4 5 6, q w e r t y);
%put &=x;

EXAMPLE 2. Shorter list is "reused":

%let x = %zipEvalf(1 2 3 4 5 6, a b c);
%put &=x;

EXAMPLE 3. Use of the operator=, shorter list is "reused":

%let y = %zipEvalf(1 2 3 4 5 6, 100 200, operator = +);
%put &=y;

%let z = %zipEvalf(1 2 3 4 5 6 8 9 10, 1 2 3 4 5 6 8 9 10, operator = **);
%put &=z;

EXAMPLE 4. Format result:

%let x = %zipEvalf(1 2 3 4 5 6, q w e r t y, format=$upcase.);
%put &=x;

%put *
%zipEvalf(
 ą ż ś ź ę ć ń ó ł
,Ą Ż Ś Ź Ę Ć Ń Ó Ł
,format = $brackets.
)
*;

EXAMPLE 5. Use with macro variables:

%let abc = 10 100 1000;
%put *
%zipEvalf(
%str(1 2 3 4 5 6 7 8 9)
,&abc.
,function = sum
)
*;

EXAMPLE 6. If one of the lists is empty:

%put *
%zipEvalf(
 abc efg
,
)
*;

EXAMPLE 7. Use of the function=, shorter list is "reused":

%put *
%zipEvalf(
 a b c
,efg
,function = catx
,argBf = %str(,)
,format = $brackets.
)
*;

%put *
%zipEvalf(
 a b c
,efg
,function = catx
,argBf = %str( )
,format = $upcase.
)
*;

%put *
%zipEvalf(
 %str(! @ # $ [ ] % ^ & * )
,1 2 3 4 5 6 7 8 9
,function = catx
,argBf = %str( )
,format = $quote.
)
*;

EXAMPLE 8. Use inside resolve:

data _null_;
z = resolve('
%zipEvalf(
 %nrstr(! @ # $ [ ] % ^ & *)
,1 2 3 4 5 6 7 8 9
,function = catx
,argBf = %str(.)
,format = $quote.
)');
put z=;
run;

EXAMPLE 9. Use in data step:

data test;
  %zipEvalf(
     a b c d e f g 
    ,1 2 3 4 5 6 7
    ,function = catx
    ,argBf    = =
    ,format   = $semicolon.
  )
run;

EXAMPLE 10. With 9.4M6 hashing() function:

%put %zipEvalf(MD5 SHA1 SHA256 SHA384 SHA512 CRC32, abcd, function = HASHING);

EXAMPLE 11. Use middle argument:

%let x = %zipEvalf(1 2 3 4 5 6, 2020, argMd=5, function=MDY, format=date11.);
%put &=x;

License

Copyright (c) 2020 Bartosz Jablonski

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.