[Math][Python] Cartesian product

From time to time, D. Knuth publishes new parts of his TAOCP book(s) (volume IV), at URLs like: https://cs.stanford.edu/~knuth/fasc12a.ps.gz

I suspect that some parts have no links from Knuth's website. It's also possible that D. Knuth updates some files from time to time.

How can we get all possible URLs of the form 'https ... fasc <number> <letter> .ps.gz'?

Python

#!/usr/bin/env python3

import os

for i in range(0, 30):
    for j in ["", "a", "b", "c", "d", "e", "f", "g", "h"]:
        fname="fasc%d%s.ps.gz" % (i, j)
        URL="https://cs.stanford.edu/~knuth/"+fname
        print (URL)
        #os.system ("wget "+URL)

But I suspect that .ps.gz is not the only extension in use; maybe there are .pdf files too? How do I fix my code?

I had to add another for loop:

#!/usr/bin/env python3

import os

for i in range(0, 30):
    for j in ["", "a", "b", "c", "d", "e", "f", "g", "h"]:
        for s in [".ps.gz", ".ps", ".pdf"]:
            fname="fasc%d%s%s" % (i, j, s)
            URL="https://cs.stanford.edu/~knuth/"+fname
            print (URL)
            #os.system ("wget "+URL)

This is not an aesthetically pleasing solution. We can do better.

#!/usr/bin/env python3

import os, itertools

lst1=map(str, range(0, 30))
lst2=["", "a", "b", "c", "d", "e", "f", "g", "h"]
lst3=[".ps.gz", ".ps", ".pdf"]

for x in itertools.product(lst1, lst2, lst3):
    fname="fasc" + "".join(x)
    URL="https://cs.stanford.edu/~knuth/"+fname
    print (URL)
    #os.system ("wget "+URL)

This is the Cartesian product of three lists. Adding another part would be less painful here than adding another for loop.
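
To illustrate that last point, here is a minimal sketch in which each varying part is just one more list passed to the same helper. The name build_names and the second prefix "draft" are made up by me for illustration only; I don't claim that any such files exist on the server.

```python
import itertools

def build_names(*parts):
    """Join every combination of the given parts into one string."""
    return ["".join(combo) for combo in itertools.product(*parts)]

numbers = [str(i) for i in range(0, 3)]
letters = ["", "a", "b"]
exts = [".ps.gz", ".pdf"]

# Three varying parts plus a fixed prefix: 1 * 3 * 3 * 2 = 18 names
three = build_names(["fasc"], numbers, letters, exts)
print(len(three))
print(three[:3])

# A fourth varying part (a hypothetical second prefix) is one more list
# element, not one more level of loop nesting: 2 * 3 * 3 * 2 = 36 names
four = build_names(["fasc", "draft"], numbers, letters, exts)
print(len(four))
```

The rightmost list varies fastest, exactly as the innermost for loop does in the nested version.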

The results, sorted by date-time (as of Nov-2024):

 583447 Feb 17  2004 fasc1.ps.gz
 521664 Dec 11  2004 fasc2a.ps.gz
 510073 Dec 28  2004 fasc2b.ps.gz
 466648 Apr  1  2005 fasc3a.ps.gz
 638021 Apr  1  2005 fasc3b.ps.gz
 915933 Oct 29  2005 fasc4a.ps.gz
2058428 Oct 29  2005 fasc4b.ps.gz
1363774 Feb 11  2008 fasc0a.ps.gz
 538905 Feb 11  2008 fasc0c.ps.gz
 674387 Feb 11  2008 fasc0b.ps.gz
1132972 Dec 23  2008 fasc1a.ps.gz
1184766 Dec 23  2008 fasc1b.ps.gz
7082867 Sep 24  2015 fasc6a.ps.gz
 728332 Sep  2  2016 fasc5c.pdf
 358784 Sep 12  2019 fasc5a.ps.gz
 469721 Sep 12  2019 fasc5b.ps.gz
5027215 Sep 12  2019 fasc5c.ps.gz
  65111 Jan  2  2020 fasc16a.ps.gz
 489888 Jan 31  2020 fasc9c.pdf
  66685 Mar 27  2020 fasc8b.ps.gz
  61519 May 22  2021 fasc20a.ps.gz
 213204 Mar 28  2022 fasc9c.ps.gz
 156159 Feb  9  2023 fasc14a.ps.gz
 234932 Feb  9  2023 fasc12a.ps.gz
 879955 Feb  9  2024 fasc8a.ps.gz
 369323 May 16  2024 fasc9b.ps.gz
1760317 Nov 23 18:01 fasc7a.ps.gz

Yes, .pdf files are also here.
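
The scripts above only print the URLs (the wget call is commented out). If the goal is just to learn which of the generated URLs exist, one possible approach is an HTTP HEAD request instead of a full download. This is only a sketch under assumptions: url_exists and find_existing are my own hypothetical helper names, and it assumes the server answers HEAD requests in a meaningful way.

```python
import urllib.error
import urllib.request

def url_exists(url, timeout=10):
    """Return True if a HEAD request for url gets a successful response."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError and HTTPError (404 and so on) are subclasses of OSError
        return False

def find_existing(urls, exists=url_exists):
    """Keep only the URLs for which the exists() check returns True."""
    return [u for u in urls if exists(u)]

# Usage (performs network requests, so it is left commented out):
# found = find_existing(all_urls)
```

The check is passed in as a parameter so that the filtering logic can be tried out offline with a fake predicate.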

The results, sorted by filename:

1363774 Feb 11  2008 fasc0a.ps.gz
 674387 Feb 11  2008 fasc0b.ps.gz
 538905 Feb 11  2008 fasc0c.ps.gz
 234932 Feb  9  2023 fasc12a.ps.gz
 156159 Feb  9  2023 fasc14a.ps.gz
  65111 Jan  2  2020 fasc16a.ps.gz
1132972 Dec 23  2008 fasc1a.ps.gz
1184766 Dec 23  2008 fasc1b.ps.gz
 583447 Feb 17  2004 fasc1.ps.gz
  61519 May 22  2021 fasc20a.ps.gz
 521664 Dec 11  2004 fasc2a.ps.gz
 510073 Dec 28  2004 fasc2b.ps.gz
 466648 Apr  1  2005 fasc3a.ps.gz
 638021 Apr  1  2005 fasc3b.ps.gz
 915933 Oct 29  2005 fasc4a.ps.gz
2058428 Oct 29  2005 fasc4b.ps.gz
 358784 Sep 12  2019 fasc5a.ps.gz
 469721 Sep 12  2019 fasc5b.ps.gz
 728332 Sep  2  2016 fasc5c.pdf
5027215 Sep 12  2019 fasc5c.ps.gz
7082867 Sep 24  2015 fasc6a.ps.gz
1760317 Nov 23 18:01 fasc7a.ps.gz
 879955 Feb  9  2024 fasc8a.ps.gz
  66685 Mar 27  2020 fasc8b.ps.gz
 369323 May 16  2024 fasc9b.ps.gz
 489888 Jan 31  2020 fasc9c.pdf
 213204 Mar 28  2022 fasc9c.ps.gz
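
Note that the plain filename order above puts fasc12a before fasc1a. If numeric order is wanted, a "natural sort" key is one option. A minimal sketch follows; natural_key is a hypothetical helper name of mine, and it assumes all names share the same "letters, then digits, then the rest" shape, as the fasc files do.

```python
import re

def natural_key(name):
    """Split name into digit and non-digit runs; compare digit runs as integers."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]

names = ["fasc12a.ps.gz", "fasc1a.ps.gz", "fasc2a.ps.gz", "fasc1.ps.gz"]
for n in sorted(names, key=natural_key):
    print(n)
```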

(My)SQL

MySQL, which is familiar to many, can solve this problem as well, via a cross join.

create database tst;
use tst;

create table `numbers`
(
`NUMBER` varchar(16) not null
);

create table `chars`
(
`_CHAR` varchar(16) not null
);

create table `exts`
(
`EXT` varchar(16) not null
);

insert ignore into numbers (NUMBER) values("0");
insert ignore into numbers (NUMBER) values("1");
insert ignore into numbers (NUMBER) values("2");
insert ignore into numbers (NUMBER) values("3");
insert ignore into numbers (NUMBER) values("4");

insert ignore into chars (_CHAR) values("");
insert ignore into chars (_CHAR) values("a");
insert ignore into chars (_CHAR) values("b");
insert ignore into chars (_CHAR) values("c");
insert ignore into chars (_CHAR) values("d");

insert ignore into exts (EXT) values(".ps");
insert ignore into exts (EXT) values(".ps.gz");
insert ignore into exts (EXT) values(".pdf");

select * from numbers cross join chars cross join exts;
+--------+-------+--------+
| NUMBER | _CHAR | EXT    |
+--------+-------+--------+
| 4      |       | .ps    |
| 4      |       | .ps.gz |
| 4      |       | .pdf   |
| 3      |       | .ps    |
| 3      |       | .ps.gz |
| 3      |       | .pdf   |
| 2      |       | .ps    |
| 2      |       | .ps.gz |
| 2      |       | .pdf   |
| 1      |       | .ps    |
| 1      |       | .ps.gz |
| 1      |       | .pdf   |
| 0      |       | .ps    |
| 0      |       | .ps.gz |
| 0      |       | .pdf   |
| 4      | a     | .ps    |
| 4      | a     | .ps.gz |
| 4      | a     | .pdf   |
| 3      | a     | .ps    |
...
| 0      | c     | .ps    |
| 0      | c     | .ps.gz |
| 0      | c     | .pdf   |
| 4      | d     | .ps    |
| 4      | d     | .ps.gz |
| 4      | d     | .pdf   |
| 3      | d     | .ps    |
| 3      | d     | .ps.gz |
| 3      | d     | .pdf   |
| 2      | d     | .ps    |
| 2      | d     | .ps.gz |
| 2      | d     | .pdf   |
| 1      | d     | .ps    |
| 1      | d     | .ps.gz |
| 1      | d     | .pdf   |
| 0      | d     | .ps    |
| 0      | d     | .ps.gz |
| 0      | d     | .pdf   |
+--------+-------+--------+
75 rows in set (0.12 sec)

select concat("https://cs.stanford.edu/~knuth/fasc", NUMBER, _CHAR, EXT) from numbers cross join chars cross join exts;
+-------------------------------------------------------------------+
| concat("https://cs.stanford.edu/~knuth/fasc", NUMBER, _CHAR, EXT) |
+-------------------------------------------------------------------+
| https://cs.stanford.edu/~knuth/fasc4.ps                           |
| https://cs.stanford.edu/~knuth/fasc4.ps.gz                        |
| https://cs.stanford.edu/~knuth/fasc4.pdf                          |
| https://cs.stanford.edu/~knuth/fasc3.ps                           |
| https://cs.stanford.edu/~knuth/fasc3.ps.gz                        |
| https://cs.stanford.edu/~knuth/fasc3.pdf                          |
| https://cs.stanford.edu/~knuth/fasc2.ps                           |
| https://cs.stanford.edu/~knuth/fasc2.ps.gz                        |
| https://cs.stanford.edu/~knuth/fasc2.pdf                          |
...
| https://cs.stanford.edu/~knuth/fasc2d.ps                          |
| https://cs.stanford.edu/~knuth/fasc2d.ps.gz                       |
| https://cs.stanford.edu/~knuth/fasc2d.pdf                         |
| https://cs.stanford.edu/~knuth/fasc1d.ps                          |
| https://cs.stanford.edu/~knuth/fasc1d.ps.gz                       |
| https://cs.stanford.edu/~knuth/fasc1d.pdf                         |
| https://cs.stanford.edu/~knuth/fasc0d.ps                          |
| https://cs.stanford.edu/~knuth/fasc0d.ps.gz                       |
| https://cs.stanford.edu/~knuth/fasc0d.pdf                         |
+-------------------------------------------------------------------+
75 rows in set (0.00 sec)

Google "mysql cross join cartesian product" to find many mentions of the fact that a cross join is a Cartesian product.
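
The same cross join can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module with an in-memory database, so it is SQLite rather than MySQL: the column names num and letter are my own choice, and string concatenation is done with the || operator instead of concat().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("create table numbers (num text not null)")
cur.execute("create table chars (letter text not null)")
cur.execute("create table exts (ext text not null)")

cur.executemany("insert into numbers values (?)", [(str(i),) for i in range(5)])
cur.executemany("insert into chars values (?)", [(c,) for c in ["", "a", "b", "c", "d"]])
cur.executemany("insert into exts values (?)", [(e,) for e in [".ps", ".ps.gz", ".pdf"]])

# Every row of numbers paired with every row of chars and every row of exts
rows = cur.execute(
    "select 'https://cs.stanford.edu/~knuth/fasc' || num || letter || ext "
    "from numbers cross join chars cross join exts"
).fetchall()
urls = [r[0] for r in rows]
conn.close()

print(len(urls))  # 5 * 5 * 3 rows
```

As with MySQL, the row order of the result is not something to rely on unless an order by clause is added.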

Wolfram Mathematica

The Outer[] function can be used here:

Outer[f,list1,list2,…]

gives the generalized outer product of the list_i, forming all possible combinations of the lowest‐level elements in each of them, and feeding them as arguments to f. 

( https://reference.wolfram.com/language/ref/Outer.html )

I would also use Flatten[] a bit:

numbers=Map[ToString,Range[0,3]];

chars=Append[Map[FromCharacterCode[#+ToCharacterCode["a"]]&,Range[0,3]],""];

ext={".ps.gz",".ps",".pdf"};

x=Outer[{#1,#2,#3}&,numbers,chars,ext]
Out[39]= {{{{0, a, .ps.gz}, {0, a, .ps}, {0, a, .pdf}}, 
   {{0, b, .ps.gz}, {0, b, .ps}, {0, b, .pdf}}, 
   {{0, c, .ps.gz}, {0, c, .ps}, {0, c, .pdf}}, 
   {{0, d, .ps.gz}, {0, d, .ps}, {0, d, .pdf}}, 
   {{0, , .ps.gz}, {0, , .ps}, {0, , .pdf}}}, 
  {{{1, a, .ps.gz}, {1, a, .ps}, {1, a, .pdf}}, 
   {{1, b, .ps.gz}, {1, b, .ps}, {1, b, .pdf}}, 
   {{1, c, .ps.gz}, {1, c, .ps}, {1, c, .pdf}}, 
   {{1, d, .ps.gz}, {1, d, .ps}, {1, d, .pdf}}, 
   {{1, , .ps.gz}, {1, , .ps}, {1, , .pdf}}}, 
  {{{2, a, .ps.gz}, {2, a, .ps}, {2, a, .pdf}}, 
   {{2, b, .ps.gz}, {2, b, .ps}, {2, b, .pdf}}, 
   {{2, c, .ps.gz}, {2, c, .ps}, {2, c, .pdf}}, 
   {{2, d, .ps.gz}, {2, d, .ps}, {2, d, .pdf}}, 
   {{2, , .ps.gz}, {2, , .ps}, {2, , .pdf}}}, 
  {{{3, a, .ps.gz}, {3, a, .ps}, {3, a, .pdf}}, 
   {{3, b, .ps.gz}, {3, b, .ps}, {3, b, .pdf}}, 
   {{3, c, .ps.gz}, {3, c, .ps}, {3, c, .pdf}}, 
   {{3, d, .ps.gz}, {3, d, .ps}, {3, d, .pdf}}, 
   {{3, , .ps.gz}, {3, , .ps}, {3, , .pdf}}}}

In[40]:= y=Flatten[x,2]
Out[40]= {{0, a, .ps.gz}, {0, a, .ps}, {0, a, .pdf}, 
  {0, b, .ps.gz}, {0, b, .ps}, {0, b, .pdf}, 
  {0, c, .ps.gz}, {0, c, .ps}, {0, c, .pdf}, 
  {0, d, .ps.gz}, {0, d, .ps}, {0, d, .pdf}, 
  {0, , .ps.gz}, {0, , .ps}, {0, , .pdf}, 
  {1, a, .ps.gz}, {1, a, .ps}, {1, a, .pdf}, 
  {1, b, .ps.gz}, {1, b, .ps}, {1, b, .pdf}, 
  {1, c, .ps.gz}, {1, c, .ps}, {1, c, .pdf}, 
  {1, d, .ps.gz}, {1, d, .ps}, {1, d, .pdf}, 
  {1, , .ps.gz}, {1, , .ps}, {1, , .pdf}, 
  {2, a, .ps.gz}, {2, a, .ps}, {2, a, .pdf}, 
  {2, b, .ps.gz}, {2, b, .ps}, {2, b, .pdf}, 
  {2, c, .ps.gz}, {2, c, .ps}, {2, c, .pdf}, 
  {2, d, .ps.gz}, {2, d, .ps}, {2, d, .pdf}, 
  {2, , .ps.gz}, {2, , .ps}, {2, , .pdf}, 
  {3, a, .ps.gz}, {3, a, .ps}, {3, a, .pdf}, 
  {3, b, .ps.gz}, {3, b, .ps}, {3, b, .pdf}, 
  {3, c, .ps.gz}, {3, c, .ps}, {3, c, .pdf}, 
  {3, d, .ps.gz}, {3, d, .ps}, {3, d, .pdf}, 
  {3, , .ps.gz}, {3, , .ps}, {3, , .pdf}}

In[48]:= Map[{"https://cs.stanford.edu/~knuth/fasc"<>#1}&, y]
Out[48]= {{https://cs.stanford.edu/~knuth/fasc0a.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc0a.ps}, 
  {https://cs.stanford.edu/~knuth/fasc0a.pdf}, 
  {https://cs.stanford.edu/~knuth/fasc0b.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc0b.ps}, 
  {https://cs.stanford.edu/~knuth/fasc0b.pdf}, 
  {https://cs.stanford.edu/~knuth/fasc0c.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc0c.ps}, 
  {https://cs.stanford.edu/~knuth/fasc0c.pdf}, 
  {https://cs.stanford.edu/~knuth/fasc0d.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc0d.ps}, 
  {https://cs.stanford.edu/~knuth/fasc0d.pdf}, 
  {https://cs.stanford.edu/~knuth/fasc0.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc0.ps}, 
  {https://cs.stanford.edu/~knuth/fasc0.pdf}, 
...
  {https://cs.stanford.edu/~knuth/fasc3c.ps}, 
  {https://cs.stanford.edu/~knuth/fasc3c.pdf}, 
  {https://cs.stanford.edu/~knuth/fasc3d.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc3d.ps}, 
  {https://cs.stanford.edu/~knuth/fasc3d.pdf}, 
  {https://cs.stanford.edu/~knuth/fasc3.ps.gz}, 
  {https://cs.stanford.edu/~knuth/fasc3.ps}, 
  {https://cs.stanford.edu/~knuth/fasc3.pdf}}
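
For comparison with the Python section, here is a rough analogue of the Outer[] and Flatten[] steps. It is only a sketch: nested list comprehensions stand in for Outer[], and two rounds of itertools.chain.from_iterable stand in for Flatten[x,2]. The variable names mirror the Mathematica session.

```python
import itertools

numbers = [str(i) for i in range(0, 4)]
chars = ["a", "b", "c", "d", ""]
ext = [".ps.gz", ".ps", ".pdf"]

# Like Outer[{#1,#2,#3}&, numbers, chars, ext]: a three-level nested list of triples
x = [[[(n, c, e) for e in ext] for c in chars] for n in numbers]

# Like Flatten[x, 2]: remove the two outer levels of nesting
flatten_once = itertools.chain.from_iterable
y = list(flatten_once(flatten_once(x)))

# Like the final Map[]: glue each triple onto the common prefix
urls = ["https://cs.stanford.edu/~knuth/fasc" + "".join(t) for t in y]

print(len(urls))
print(urls[0])
```

The flattened list y is the same as list(itertools.product(numbers, chars, ext)), which is why itertools.product needs no separate flattening step.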

(This post was first published on 20241129.)

