NAME
fetchMakeURL,
fetchParseURL,
fetchCopyURL,
fetchFreeURL,
fetchXGetURL,
fetchGetURL,
fetchPutURL,
fetchStatURL,
fetchListURL,
fetchXGet,
fetchGet,
fetchPut,
fetchStat,
fetchList,
fetchXGetFile,
fetchGetFile,
fetchPutFile,
fetchStatFile,
fetchListFile,
fetchXGetHTTP,
fetchGetHTTP,
fetchPutHTTP,
fetchStatHTTP,
fetchListHTTP,
fetchXGetFTP,
fetchGetFTP,
fetchPutFTP,
fetchStatFTP,
fetchListFTP,
fetchInitURLList,
fetchFreeURLList,
fetchUnquotePath,
fetchUnquoteFilename,
fetchStringifyURL,
fetchConnectionCacheInit,
fetchConnectionCacheClose,
fetch —
file transfer functions
LIBRARY
File Transfer Library (libfetch, -lfetch)
SYNOPSIS
#include <stdio.h>
#include <fetch.h>
struct url *
fetchMakeURL(const char *scheme, const char *host, int port,
    const char *doc, const char *user, const char *pwd);
struct url *
fetchParseURL(const char *URL);
struct url *
fetchCopyURL(const struct url *u);
void
fetchFreeURL(struct url *u);
fetchIO *
fetchXGetURL(const char *URL, struct url_stat *us, const char *flags);
fetchIO *
fetchGetURL(const char *URL, const char *flags);
fetchIO *
fetchPutURL(const char *URL, const char *flags);
int
fetchStatURL(const char *URL, struct url_stat *us, const char *flags);
int
fetchListURL(struct url_list *list, const char *URL, const char *pattern,
    const char *flags);
fetchIO *
fetchXGet(struct url *u, struct url_stat *us, const char *flags);
fetchIO *
fetchGet(struct url *u, const char *flags);
fetchIO *
fetchPut(struct url *u, const char *flags);
int
fetchStat(struct url *u, struct url_stat *us, const char *flags);
int
fetchList(struct url_list *list, struct url *u, const char *pattern,
    const char *flags);
fetchIO *
fetchXGetFile(struct url *u, struct url_stat *us, const char *flags);
fetchIO *
fetchGetFile(struct url *u, const char *flags);
fetchIO *
fetchPutFile(struct url *u, const char *flags);
int
fetchStatFile(struct url *u, struct url_stat *us, const char *flags);
int
fetchListFile(struct url_list *list, struct url *u, const char *pattern,
    const char *flags);
fetchIO *
fetchXGetHTTP(struct url *u, struct url_stat *us, const char *flags);
fetchIO *
fetchGetHTTP(struct url *u, const char *flags);
fetchIO *
fetchPutHTTP(struct url *u, const char *flags);
int
fetchStatHTTP(struct url *u, struct url_stat *us, const char *flags);
int
fetchListHTTP(struct url_list *list, struct url *u, const char *pattern,
    const char *flags);
fetchIO *
fetchXGetFTP(struct url *u, struct url_stat *us, const char *flags);
fetchIO *
fetchGetFTP(struct url *u, const char *flags);
fetchIO *
fetchPutFTP(struct url *u, const char *flags);
int
fetchStatFTP(struct url *u, struct url_stat *us, const char *flags);
int
fetchListFTP(struct url_list *list, struct url *u, const char *pattern,
    const char *flags);
void
fetchInitURLList(struct url_list *ul);
int
fetchAppendURLList(struct url_list *dst, const struct url_list *src);
void
fetchFreeURLList(struct url_list *ul);
char *
fetchUnquotePath(struct url *u);
char *
fetchUnquoteFilename(struct url *u);
char *
fetchStringifyURL(const struct url *u);
void
fetchConnectionCacheInit(int global, int per_host);
void
fetchConnectionCacheClose(void);
DESCRIPTION
These functions implement a high-level library for retrieving and uploading
files using Uniform Resource Locators (URLs).
fetchParseURL() takes a URL in the form of a null-terminated
string and splits it into its components according to the Common
Internet Scheme Syntax detailed in RFC 1738. A regular expression which
produces this syntax is:
<scheme>:(//(<user>(:<pwd>)?@)?<host>(:<port>)?)?/(<doc>)?
If the URL does not seem to begin with a scheme name, it is assumed to be a
local path. Only absolute path names are accepted.
Note that some components of the URL are not necessarily relevant to all URL
schemes. For instance, the file scheme only needs the ⟨scheme⟩
and ⟨doc⟩ components.
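The component syntax above can be explored with a small stand-alone sketch. It is not how fetchParseURL() is implemented; it only approximates the documented expression with a POSIX extended regular expression. The names struct url_parts, copy_group() and split_url() are hypothetical, the ⟨doc⟩ part is treated as optional, and no quoting of unsafe characters is attempted.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical container for the pieces; this is NOT the library's struct url. */
struct url_parts {
	char scheme[17];
	char user[257];
	char pwd[257];
	char host[256];
	char port[6];
	char doc[1024];
};

/* Copy one capture group of s into dst; dst is left empty if the group did not match. */
static void
copy_group(const char *s, const regmatch_t *m, char *dst, size_t dstlen)
{
	size_t n;

	dst[0] = '\0';
	if (m->rm_so < 0)
		return;
	n = (size_t)(m->rm_eo - m->rm_so);
	if (n >= dstlen)
		n = dstlen - 1;		/* truncate rather than overflow */
	memcpy(dst, s + m->rm_so, n);
	dst[n] = '\0';
}

/* Split url into its parts. Returns 0 on success, -1 if url does not match. */
int
split_url(const char *url, struct url_parts *p)
{
	/* Groups: 1 scheme, 4 user, 6 pwd, 7 host, 9 port, 10 doc. */
	static const char pattern[] =
	    "^([^:/]+):(//(([^:@/]*)(:([^@/]*))?@)?([^:@/]+)(:([0-9]+))?)?(/.*)?$";
	regex_t re;
	regmatch_t m[11];
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	rc = regexec(&re, url, 11, m, 0);
	regfree(&re);
	if (rc != 0)
		return -1;
	copy_group(url, &m[1], p->scheme, sizeof(p->scheme));
	copy_group(url, &m[4], p->user, sizeof(p->user));
	copy_group(url, &m[6], p->pwd, sizeof(p->pwd));
	copy_group(url, &m[7], p->host, sizeof(p->host));
	copy_group(url, &m[9], p->port, sizeof(p->port));
	copy_group(url, &m[10], p->doc, sizeof(p->doc));
	return 0;
}
```

Calling split_url() on a URL without a user name or port leaves the corresponding fields empty, which mirrors the remark that not every component is relevant to every scheme.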
fetchParseURL() quotes any unsafe character in the URL automatically.
This is not done by fetchMakeURL().
fetchCopyURL() copies an existing url structure.
fetchMakeURL(),
fetchParseURL(), and
fetchCopyURL() return a pointer to a
url structure, which is defined as follows in
<fetch.h>:
#define URL_SCHEMELEN 16
#define URL_USERLEN 256
#define URL_PWDLEN 256
#define URL_HOSTLEN 255
struct url {
char scheme[URL_SCHEMELEN + 1];
char user[URL_USERLEN + 1];
char pwd[URL_PWDLEN + 1];
char host[URL_HOSTLEN + 1];
int port;
char *doc;
off_t offset;
size_t length;
time_t last_modified;
};
The pointer returned by fetchMakeURL(), fetchCopyURL(), and
fetchParseURL() should be freed using fetchFreeURL(). The size of
struct url is not part of the ABI.
fetchXGetURL(),
fetchGetURL(), and
fetchPutURL() constitute the recommended interface to the
fetch library. They examine the URL passed to them to
determine the transfer method, and call the appropriate lower-level functions
to perform the actual transfer.
fetchXGetURL() also returns
the remote document's metadata in the
url_stat structure
pointed to by the
us argument.
The
flags argument is a string of characters which specify
transfer options. The meaning of the individual flags is scheme-dependent, and
is detailed in the appropriate section below.
fetchStatURL() attempts to obtain the requested document's
metadata and fill in the structure pointed to by its second argument. The
url_stat structure is defined as follows in
<fetch.h>:
struct url_stat {
off_t size;
time_t atime;
time_t mtime;
};
If the size could not be obtained from the server, the
size field is set to -1. If the modification time could
not be obtained from the server, the
mtime field is set
to the epoch. If the access time could not be obtained from the server, the
atime field is set to the modification time.
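A caller therefore has to check for these sentinel values before trusting a field. The helpers below are a minimal sketch of such checks. The structure is repeated locally only so that the sketch builds without <fetch.h>; stat_has_size() and stat_has_mtime() are hypothetical names, and "the epoch" is assumed to be a time_t value of 0.

```c
#include <sys/types.h>
#include <time.h>

/* Local copy of the layout shown above, so the sketch builds without <fetch.h>. */
struct url_stat {
	off_t	size;
	time_t	atime;
	time_t	mtime;
};

/* Returns 1 if the server reported a size, 0 if size holds the -1 sentinel. */
int
stat_has_size(const struct url_stat *us)
{
	return us->size != -1;
}

/* Returns 1 if a modification time was reported, 0 if mtime is the epoch (assumed 0). */
int
stat_has_mtime(const struct url_stat *us)
{
	return us->mtime != 0;
}
```

There is no comparable check for atime: a value equal to mtime may be either a real access time or the documented fallback.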
fetchListURL() attempts to list the contents of the directory
pointed to by the URL provided. The pattern can be a simple glob-like
expression and serves only as a hint; callers should not depend on the
server to filter names.
If successful, it appends the list of entries to the url_list structure.
The url_list structure is defined as follows in
<fetch.h>:
struct url_list {
size_t length;
size_t alloc_size;
struct url *urls;
};
The list should be initialized by calling fetchInitURLList(), and the
entries should be freed by calling fetchFreeURLList(). The function
fetchAppendURLList() can be used to append one URL list to another.
If the ‘c’ (cache result) flag is specified, the library is allowed to
internally cache the result.
fetchStringifyURL() returns the URL as a string.
fetchUnquotePath() returns the path name part of the URL
with any quoting undone. Query arguments and fragment identifiers are not
included.
fetchUnquoteFilename() returns the last component
of the path name as returned by
fetchUnquotePath().
fetchStringifyURL(),
fetchUnquotePath(),
and
fetchUnquoteFilename() return a string that should be
deallocated with
free() after use.
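The effect of undoing the quoting can be illustrated without the library. The sketch below decodes %XX escapes in a document string, stops at the first query or fragment delimiter, and extracts the last path component. It is an illustration of the documented behaviour under stated assumptions, not the library's code: unquote_path() and unquote_filename() are hypothetical helpers that take a plain string rather than a struct url, and malformed escapes are simply copied through.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int
hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Decode %XX escapes in doc, up to the first '?' or '#'. The caller frees the result. */
char *
unquote_path(const char *doc)
{
	size_t len = strcspn(doc, "?#");	/* drop query and fragment */
	size_t i, o = 0;
	char *out = malloc(len + 1);

	if (out == NULL)
		return NULL;
	for (i = 0; i < len; i++) {
		int hi = -1, lo = -1;

		if (doc[i] == '%' && i + 2 < len &&
		    (hi = hexval((unsigned char)doc[i + 1])) >= 0 &&
		    (lo = hexval((unsigned char)doc[i + 2])) >= 0) {
			out[o++] = (char)(hi * 16 + lo);
			i += 2;			/* skip the two hex digits */
		} else {
			out[o++] = doc[i];	/* ordinary or malformed: copy as is */
		}
	}
	out[o] = '\0';
	return out;
}

/* Last component of the unquoted path. The caller frees the result. */
char *
unquote_filename(const char *doc)
{
	char *path = unquote_path(doc);
	char *slash;

	if (path == NULL)
		return NULL;
	slash = strrchr(path, '/');
	if (slash != NULL)
		memmove(path, slash + 1, strlen(slash + 1) + 1);
	return path;
}
```

As with the library functions, both helpers return heap memory that the caller releases with free().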
fetchConnectionCacheInit() enables the connection cache. The
first argument specifies the global limit on cached connections. The second
argument specifies the per-host limit. Entries are considered to specify the
same host if the host name from the URL is identical, independent of the
address or address family.
fetchConnectionCacheClose() flushes the
connection cache and closes all cached connections.
fetchXGet(),
fetchGet(),
fetchPut(), and
fetchStat() are similar to
fetchXGetURL(),
fetchGetURL(),
fetchPutURL(), and
fetchStatURL(), except
that they expect a pre-parsed URL in the form of a pointer to a
struct url rather than a string.
All of the
fetchXGetXXX(),
fetchGetXXX(),
and
fetchPutXXX() functions return a pointer to a stream
which can be used to read or write data from or to the requested document,
respectively. Note that although the implementation details of the individual
access methods vary, it can generally be assumed that a stream returned by one
of the
fetchXGetXXX() or
fetchGetXXX()
functions is read-only, and that a stream returned by one of the
fetchPutXXX() functions is write-only.
PROTOCOL INDEPENDENT FLAGS
If the ‘i’ (if-modified-since) flag is specified, the library will try to
fetch the content only if it is newer than last_modified. For HTTP, an
If-Modified-Since HTTP header is sent. For FTP, an MDTM command is sent
first and the result is compared locally. For FILE, the modification time
of the source file is compared.
FILE SCHEME
fetchXGetFile(),
fetchGetFile(), and
fetchPutFile() provide access to documents which are files
in a locally mounted file system. Only the ⟨document⟩ component
of the URL is used.
fetchXGetFile() and
fetchGetFile() do not
accept any flags.
fetchPutFile() accepts the ‘a’ (append to file) flag. If that flag is
specified, the data written to the stream returned by
fetchPutFile() will be appended to the previous contents of
the file, instead of replacing them.
FTP SCHEME
fetchXGetFTP(),
fetchGetFTP(), and
fetchPutFTP() implement the FTP protocol as described in RFC
959.
By default, libfetch will attempt to use passive mode first and
only fall back to active mode if the server reports a syntax error. If the
‘a’ (active) flag is specified, a passive connection is not tried and
active mode is used directly.
If the ‘l’ (low) flag is specified, data sockets will be allocated in the
low (or default) port range instead of the high port range (see ip(4)).
If the ‘d’ (direct) flag is specified, fetchXGetFTP(), fetchGetFTP(), and
fetchPutFTP() will use a direct connection even if a proxy
server is defined.
If no user name or password is given, the
fetch library will
attempt an anonymous login, with user name "anonymous" and password
"anonymous@<hostname>".
HTTP SCHEME
The
fetchXGetHTTP(),
fetchGetHTTP(), and
fetchPutHTTP() functions implement the HTTP/1.1 protocol.
With a little luck, there is even a chance that they comply with RFC 2616 and
RFC 2617.
If the ‘d’ (direct) flag is specified, fetchXGetHTTP(), fetchGetHTTP(), and
fetchPutHTTP() will use a direct connection even if a proxy
server is defined.
Since there seems to be no good way of implementing the HTTP PUT method in a
manner consistent with the rest of the
fetch library,
fetchPutHTTP() is currently unimplemented.
AUTHENTICATION
Apart from setting the appropriate environment variables and specifying the user
name and password in the URL or the
struct url, the
calling program has the option of defining an authentication function with the
following prototype:
int myAuthMethod(struct url *u)
The callback function should fill in the user and pwd fields in the
provided struct url and return 0 on success, or any other value to
indicate failure.
To register the authentication callback, simply set
fetchAuthMethod to point at it. The callback will be
used whenever a site requires authentication and the appropriate environment
variables are not set.
This interface is experimental and may be subject to change.
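A callback with that prototype might look like the sketch below. Several things in it are assumptions rather than part of the library: the structure is copied from the definition earlier in this page only so that the sketch builds without <fetch.h>, and the environment variable names MYAPP_USER and MYAPP_PASSWORD are hypothetical. A real program would include <fetch.h> instead of the local definitions and set fetchAuthMethod to point at the function, as described above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

/* Local copy of the definitions shown earlier; real code includes <fetch.h>. */
#define URL_SCHEMELEN 16
#define URL_USERLEN 256
#define URL_PWDLEN 256
#define URL_HOSTLEN 255
struct url {
	char	 scheme[URL_SCHEMELEN + 1];
	char	 user[URL_USERLEN + 1];
	char	 pwd[URL_PWDLEN + 1];
	char	 host[URL_HOSTLEN + 1];
	int	 port;
	char	*doc;
	off_t	 offset;
	size_t	 length;
	time_t	 last_modified;
};

/*
 * Fill in u->user and u->pwd from two hypothetical environment
 * variables. Returns 0 on success, non-zero on failure.
 */
int
myAuthMethod(struct url *u)
{
	const char *user = getenv("MYAPP_USER");	/* hypothetical name */
	const char *pwd = getenv("MYAPP_PASSWORD");	/* hypothetical name */

	if (user == NULL || pwd == NULL)
		return -1;
	/* Refuse credentials that would not fit the fixed-size fields. */
	if (strlen(user) > URL_USERLEN || strlen(pwd) > URL_PWDLEN)
		return -1;
	snprintf(u->user, sizeof(u->user), "%s", user);
	snprintf(u->pwd, sizeof(u->pwd), "%s", pwd);
	return 0;
}
```

Rejecting over-long values instead of truncating them avoids sending a silently shortened password to the server.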
RETURN VALUES
fetchParseURL() returns a pointer to a
struct
url containing the individual components of the URL. If it is unable to
allocate memory, or the URL is syntactically incorrect,
fetchParseURL() returns a
NULL
pointer.
The
fetchStat() functions return 0 on success and -1 on
failure.
All other functions return a stream pointer which may be used to access the
requested document, or
NULL
if an error occurred.
The following error codes are defined in
<fetch.h>:
[FETCH_ABORT]    Operation aborted
[FETCH_AUTH]     Authentication failed
[FETCH_DOWN]     Service unavailable
[FETCH_EXISTS]   File exists
[FETCH_FULL]     File system full
[FETCH_INFO]     Informational response
[FETCH_MEMORY]   Insufficient memory
[FETCH_MOVED]    File has moved
[FETCH_NETWORK]  Network error
[FETCH_OK]       No error
[FETCH_PROTO]    Protocol error
[FETCH_RESOLV]   Resolver error
[FETCH_SERVER]   Server error
[FETCH_TEMP]     Temporary error
[FETCH_TIMEOUT]  Operation timed out
[FETCH_UNAVAIL]  File is not available
[FETCH_UNKNOWN]  Unknown error
[FETCH_URL]      Invalid URL
The accompanying error message includes a protocol-specific error code and
message, e.g. "File is not available (404 Not Found)".
ENVIRONMENT
FETCH_BIND_ADDRESS
    Specifies a host name or IP address to which sockets used for outgoing
    connections will be bound.
FTP_LOGIN
    Default FTP login if none was provided in the URL.
FTP_PASSIVE_MODE
    If set to anything but ‘no’, forces the FTP code to use passive mode.
FTP_PASSWORD
    Default FTP password if the remote server requests one and none was
    provided in the URL.
FTP_PROXY
    URL of the proxy to use for FTP requests. The document part is ignored.
    FTP and HTTP proxies are supported; if no scheme is specified, FTP is
    assumed. If the proxy is an FTP proxy, libfetch will send ‘user@host’ as
    user name to the proxy, where ‘user’ is the real user name, and ‘host’
    is the name of the FTP server.
    If this variable is set to an empty string, no proxy will be used for
    FTP requests, even if the HTTP_PROXY variable is set.
ftp_proxy
    Same as FTP_PROXY, for compatibility.
HTTP_AUTH
    Specifies HTTP authorization parameters as a colon-separated list of
    items. The first and second item are the authorization scheme and realm
    respectively; further items are scheme-dependent. Currently, only basic
    authorization is supported.
    Basic authorization requires two parameters: the user name and password,
    in that order.
    This variable is only used if the server requires authorization and no
    user name or password was specified in the URL.
HTTP_PROXY
    URL of the proxy to use for HTTP requests. The document part is ignored.
    Only HTTP proxies are supported for HTTP requests. If no port number is
    specified, the default is 3128.
    Note that this proxy will also be used for FTP documents, unless the
    FTP_PROXY variable is set.
http_proxy
    Same as HTTP_PROXY, for compatibility.
HTTP_PROXY_AUTH
    Specifies authorization parameters for the HTTP proxy in the same format
    as the HTTP_AUTH variable.
    This variable is used if and only if connected to an HTTP proxy, and is
    ignored if a user and/or a password were specified in the proxy URL.
HTTP_REFERER
    Specifies the referrer URL to use for HTTP requests. If set to “auto”,
    the document URL will be used as referrer URL.
HTTP_USER_AGENT
    Specifies the User-Agent string to use for HTTP requests. This can be
    useful when working with HTTP origin or proxy servers that differentiate
    between user agents.
NETRC
    Specifies a file to use instead of ~/.netrc to look up login names and
    passwords for FTP sites. See ftp(1) for a description of the file
    format. This feature is experimental.
NO_PROXY
    Either a single asterisk, which disables the use of proxies altogether,
    or a comma- or whitespace-separated list of hosts for which proxies
    should not be used.
no_proxy
    Same as NO_PROXY, for compatibility.
EXAMPLES
To access a proxy server on proxy.example.com port 8080, set the
HTTP_PROXY environment variable in a manner similar to this:
HTTP_PROXY=http://proxy.example.com:8080
If the proxy server requires authentication, there are two options available for
passing the authentication data. The first method is by using the proxy URL:
HTTP_PROXY=http://<user>:<pwd>@proxy.example.com:8080
The second method is by using the HTTP_PROXY_AUTH environment variable:
HTTP_PROXY=http://proxy.example.com:8080
HTTP_PROXY_AUTH=basic:*:<user>:<pwd>
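The colon-separated format used by HTTP_AUTH and HTTP_PROXY_AUTH can be taken apart as in the following sketch, which handles only the four-item form used for basic authorization (scheme, realm, user name, password). The names struct auth_params and parse_auth() are hypothetical, and treating everything after the third colon as the password is an assumption of this sketch, not documented library behaviour.

```c
#include <string.h>

/* Hypothetical holder for the four items of a basic authorization value. */
struct auth_params {
	char scheme[32];
	char realm[128];
	char user[257];
	char pwd[257];
};

/*
 * Split "scheme:realm:user:pwd" into ap.
 * Returns 0 on success, -1 if an item is missing or too long.
 */
int
parse_auth(const char *value, struct auth_params *ap)
{
	char *fields[4];
	size_t sizes[4];
	const char *p = value;
	int i;

	fields[0] = ap->scheme;	sizes[0] = sizeof(ap->scheme);
	fields[1] = ap->realm;	sizes[1] = sizeof(ap->realm);
	fields[2] = ap->user;	sizes[2] = sizeof(ap->user);
	fields[3] = ap->pwd;	sizes[3] = sizeof(ap->pwd);

	for (i = 0; i < 4; i++) {
		/* The last item runs to the end of the string (assumption). */
		size_t n = (i < 3) ? strcspn(p, ":") : strlen(p);

		if (n >= sizes[i])
			return -1;	/* item does not fit */
		if (i < 3 && p[n] != ':')
			return -1;	/* fewer than four items */
		memcpy(fields[i], p, n);
		fields[i][n] = '\0';
		p += n;
		if (i < 3)
			p++;		/* step over the colon */
	}
	return 0;
}
```

For the value basic:*:alice:s3cret this yields the scheme "basic", the realm "*", the user name "alice" and the password "s3cret".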
To disable the use of a proxy for an HTTP server running on the local host,
define NO_PROXY as follows:
NO_PROXY=localhost,127.0.0.1
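How such a list might be consulted is sketched below. The function no_proxy_match() is hypothetical, and it assumes an exact, case-insensitive comparison of whole host names; the matching rule the library actually applies may differ (for example, it may also match domain suffixes).

```c
#include <string.h>
#include <strings.h>

/*
 * Returns 1 if host is covered by a NO_PROXY-style list, 0 otherwise.
 * Assumption: entries must equal the host name exactly, ignoring case.
 */
int
no_proxy_match(const char *list, const char *host)
{
	static const char seps[] = ", \t\n";
	const char *p = list;
	size_t hlen;

	if (list == NULL || host == NULL)
		return 0;
	if (strcmp(list, "*") == 0)
		return 1;		/* a single asterisk disables all proxies */
	hlen = strlen(host);
	while (*p != '\0') {
		size_t n;

		p += strspn(p, seps);	/* skip separators */
		n = strcspn(p, seps);	/* length of the next entry */
		if (n > 0 && n == hlen && strncasecmp(p, host, n) == 0)
			return 1;
		p += n;
	}
	return 0;
}
```

With the setting shown above, both "localhost" and "127.0.0.1" would bypass the proxy, while any other host would still use it.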
SEE ALSO
ftp(1),
ip(4)
J. Postel and J. K. Reynolds, File Transfer Protocol, October 1985, RFC 959.
P. Deutsch, A. Emtage, and A. Marine, How to Use Anonymous FTP, May 1994,
RFC 1635.
T. Berners-Lee, L. Masinter, and M. McCahill, Uniform Resource Locators (URL),
December 1994, RFC 1738.
R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and
T. Berners-Lee, Hypertext Transfer Protocol -- HTTP/1.1, January 1999,
RFC 2616.
J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen,
and L. Stewart, HTTP Authentication: Basic and Digest Access Authentication,
June 1999, RFC 2617.
HISTORY
The fetch library first appeared in FreeBSD 3.0.
AUTHORS
The fetch library was mostly written by Dag-Erling Smørgrav
<des@FreeBSD.org> with numerous suggestions from Jordan K. Hubbard
<jkh@FreeBSD.org>, Eugene Skepner <eu@qub.com> and other FreeBSD developers.
It replaces the older ftpio library written by Poul-Henning Kamp
<phk@FreeBSD.org> and Jordan K. Hubbard <jkh@FreeBSD.org>.
This manual page was written by Dag-Erling Smørgrav <des@FreeBSD.org>.
BUGS
Some parts of the library are not yet implemented. The most notable examples of
this are
fetchPutHTTP() and FTP proxy support.
There is no way to select a proxy at run-time other than setting the
HTTP_PROXY
or
FTP_PROXY
environment variables as appropriate.
libfetch does not understand or obey 305 (Use Proxy) replies.
Error numbers are unique only within a certain context; the error codes used for
FTP and HTTP overlap, as do those used for resolver and system errors. For
instance, error code 202 means "Command not implemented, superfluous at
this site" in an FTP context and "Accepted" in an HTTP context.
fetchStatFTP() does not check that the result of an MDTM
command is a valid date.
The man page is incomplete, poorly written and produces badly formatted text.
The error reporting mechanism is unsatisfactory.
Some parts of the code are not fully reentrant.