From: Mark Gaiser <markg85@gmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v5 1/1] avformat: Add IPFS protocol support.
Date: Sat, 12 Feb 2022 19:44:47 +0100
Message-ID: <CAPd6JnGhhgKAFgviudWx_tkK-0mU6B+vDsWAkUEDv7aQW22E6g@mail.gmail.com>
In-Reply-To: <CAPd6JnGB++sE4CZj+6NGDaTToi08xWNRZsAFPtMEy8niLA9jhw@mail.gmail.com>

On Sat, Feb 12, 2022 at 6:57 PM Mark Gaiser <markg85@gmail.com> wrote:

>
>
> On Sat, Feb 12, 2022 at 1:19 PM Tomas Härdin <tjoppen@acc.umu.se> wrote:
>
>> Thu 2022-02-10 at 02:13 +0100, Mark Gaiser wrote:
>> > This patch adds support for:
>> > - ffplay ipfs://<cid>
>> > - ffplay ipns://<cid>
>> >
>> > IPFS data can be played from so called "ipfs gateways".
>> > A gateway is essentially a webserver that gives access to the
>> > distributed IPFS network.
>> >
>> > This protocol support (ipfs and ipns) therefore translates
>> > ipfs:// and ipns:// to a http:// url. This resulting url is
>> > then handled by the http protocol. It could also be https
>> > depending on the gateway provided.
>> >
>> > To use this protocol, a gateway must be provided.
>> > If you do nothing it will try to find it in your
>> > $HOME/.ipfs/gateway file. The ways to set it manually are:
>> > 1. Define a -gateway <url> to the gateway.
>> > 2. Define $IPFS_GATEWAY with the full http link to the gateway.
>> > 3. Define $IPFS_PATH and point it to the IPFS data path.
>> > 4. Have IPFS running in your local user folder (under $HOME/.ipfs).
>> >
>> > Signed-off-by: Mark Gaiser <markg85@gmail.com>
>> > ---
>> >  configure | 2 +
>> >  doc/protocols.texi | 30 ++++
>> >  libavformat/Makefile | 2 +
>> >  libavformat/ipfsgateway.c | 326 ++++++++++++++++++++++++++++++++++++++
>> >  libavformat/protocols.c | 2 +
>> >  5 files changed, 362 insertions(+)
>> >  create mode 100644 libavformat/ipfsgateway.c
>> >
>> > diff --git a/configure b/configure
>> > index 5b19a35f59..6ff09e7974 100755
>> > --- a/configure
>> > +++ b/configure
>> > @@ -3585,6 +3585,8 @@ udp_protocol_select="network"
>> >  udplite_protocol_select="network"
>> >  unix_protocol_deps="sys_un_h"
>> >  unix_protocol_select="network"
>> > +ipfs_protocol_select="https_protocol"
>> > +ipns_protocol_select="https_protocol"
>> >
>> >  # external library protocols
>> >  libamqp_protocol_deps="librabbitmq"
>> > diff --git a/doc/protocols.texi b/doc/protocols.texi
>> > index d207df0b52..7c9c0a4808 100644
>> > --- a/doc/protocols.texi
>> > +++ b/doc/protocols.texi
>> > @@ -2025,5 +2025,35 @@ decoding errors.
>> >
>> >  @end table
>> >
>> > +@section ipfs
>> > +
>> > +InterPlanetary File System (IPFS) protocol support. One can access files stored
>> > +on the IPFS network through so called gateways. Those are http(s) endpoints.
>> > +This protocol wraps the IPFS native protocols (ipfs:// and ipns://) to be send
>> > +to such a gateway. Users can (and should) host their own node which means this
>> > +protocol will use your local machine gateway to access files on the IPFS network.
>> > +
>> > +If a user doesn't have a node of their own then the public gateway dweb.link is
>> > +used by default.
>> > +
>> > +You can use this protocol in 2 ways. Using IPFS:
>> > +@example
>> > +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
>> > +@end example
>> > +
>> > +Or the IPNS protocol (IPNS is mutable IPFS):
>> > +@example
>> > +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
>> > +@end example
>> > +
>> > +You can also change the gateway to be used:
>> > +
>> > +@table @option
>> > +
>> > +@item gateway
>> > +Defines the gateway to use. When nothing is provided the protocol will first try
>> > +your local gateway. If that fails dweb.link will be used.
>> > +
>> > +@end table
>> >
>> >  @c man end PROTOCOLS
>> > diff --git a/libavformat/Makefile b/libavformat/Makefile
>> > index 3dc6a479cc..4edce8420f 100644
>> > --- a/libavformat/Makefile
>> > +++ b/libavformat/Makefile
>> > @@ -656,6 +656,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += srtpproto.o srtp.o
>> >  OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o
>> >  OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o
>> >  OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o
>> > +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o
>> > +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o
>> >  TLS-OBJS-$(CONFIG_GNUTLS) += tls_gnutls.o
>> >  TLS-OBJS-$(CONFIG_LIBTLS) += tls_libtls.o
>> >  TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o
>> > diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
>> > new file mode 100644
>> > index 0000000000..9ebced10c7
>> > --- /dev/null
>> > +++ b/libavformat/ipfsgateway.c
>> > @@ -0,0 +1,326 @@
>> > +/*
>> > + * IPFS and IPNS protocol support through IPFS Gateway.
>> > + * Copyright (c) 2022 Mark Gaiser
>> > + *
>> > + * This file is part of FFmpeg.
>> > + *
>> > + * FFmpeg is free software; you can redistribute it and/or
>> > + * modify it under the terms of the GNU Lesser General Public
>> > + * License as published by the Free Software Foundation; either
>> > + * version 2.1 of the License, or (at your option) any later version.
>> > + *
>> > + * FFmpeg is distributed in the hope that it will be useful,
>> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> > + * Lesser General Public License for more details.
>> > + *
>> > + * You should have received a copy of the GNU Lesser General Public
>> > + * License along with FFmpeg; if not, write to the Free Software
>> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> > + */
>> > +
>> > +#include "avformat.h"
>> > +#include "libavutil/avassert.h"
>> > +#include "libavutil/avstring.h"
>> > +#include "libavutil/internal.h"
>> > +#include "libavutil/opt.h"
>> > +#include "libavutil/tree.h"
>> > +#include <fcntl.h>
>> > +#if HAVE_IO_H
>> > +#include <io.h>
>> > +#endif
>> > +#if HAVE_UNISTD_H
>> > +#include <unistd.h>
>> > +#endif
>> > +#include "os_support.h"
>> > +#include "url.h"
>> > +#include <stdlib.h>
>> > +#include <sys/stat.h>
>> > +
>> > +typedef struct IPFSGatewayContext {
>> > +    AVClass *class;
>> > +    URLContext *inner;
>> > +    char *gateway;
>> > +} IPFSGatewayContext;
>> > +
>> > +// A best-effort way to find the IPFS gateway.
>> > +// Only the most appropiate gateway is set. It's not actually requested
>> > +// (http call) to prevent a potential slowdown in startup. A potential timeout
>> > +// is handled by the HTTP protocol.
>> > +static int populate_ipfs_gateway(URLContext *h, char *gateway)
>> > +{
>> > +    char ipfs_full_data_folder[PATH_MAX];
>> > +    char ipfs_gateway_file[PATH_MAX];
>> > +    struct stat st;
>> > +    int stat_ret = 0;
>> > +    int ret = AVERROR(EINVAL);
>> > +    FILE *gateway_file = NULL;
>> > +    char gateway_file_data[PATH_MAX];
>> > +
>> > +    // We could already have a gateway (it would have been passed as -gateway).
>> > +    if (*gateway != '\0') {
>> > +        ret = 1;
>> > +        goto err;
>> > +    }
>>
>> Why even call this function if this is the case?
>>
>> > +
>> > +    // Test $IPFS_GATEWAY.
>> > +    if (getenv("IPFS_GATEWAY") != NULL) {
>> > +        snprintf(gateway, PATH_MAX, "%s", getenv("IPFS_GATEWAY"));
>>
>> Passing buffer length rather than assuming PATH_MAX would be better.
>> That way it can be changed without this breaking.
>>
>> Or stick the buffer in IPFSGatewayContext and use sizeof()
>>
>
> Awesome! Didn't even consider the option yet. Will do.
>
>>
>> > +        ret = 1;
>> > +        goto err;
>> > +    } else
>> > +        av_log(h, AV_LOG_DEBUG, "$IPFS_GATEWAY is empty.\n");
>> > +
>> > +    // We need to know the IPFS folder to - eventually - read the contents of
>> > +    // the "gateway" file which would tell us the gateway to use.
>> > +    if (getenv("IPFS_PATH") == NULL) {
>> > +        av_log(h, AV_LOG_DEBUG, "$IPFS_PATH is empty.\n");
>> > +
>> > +        // Try via the home folder.
>> > +        if (getenv("HOME") == NULL) {
>> > +            av_log(h, AV_LOG_ERROR, "$HOME appears to be empty.\n");
>> > +            ret = AVERROR(EINVAL);
>> > +            goto err;
>> > +        }
>> > +
>> > +        // Verify the composed path fits.
>> > +        if (snprintf(ipfs_full_data_folder, PATH_MAX, "%s/.ipfs/",
>>
>> sizeof(ipfs_full_data_folder)
>> This goes for most PATH_MAX in here
>>
>
> Yeah, I kinda thought using PATH_MAX everywhere would be consistent.
> Flipping all of those to sizeof(...)
>
>>
>> > +                     getenv("HOME")) > PATH_MAX) {
>> > +            av_log(h, AV_LOG_ERROR, "The IPFS data path exceeds the max path length (%i)\n", PATH_MAX);
>> > +            ret = AVERROR(EINVAL);
>> > +            goto err;
>> > +        }
>> > +
>> > +        // Stat the folder.
>> > +        // It should exist in a default IPFS setup when run as local user.
>> > +#ifndef _WIN32
>> > +        stat_ret = stat(ipfs_full_data_folder, &st);
>> > +#else
>> > +        stat_ret = win32_stat(ipfs_full_data_folder, &st);
>> > +#endif
>> > +        if (stat_ret < 0) {
>> > +            av_log(h, AV_LOG_INFO, "Unable to find IPFS folder. We tried:\n");
>> > +            av_log(h, AV_LOG_INFO, "- $IPFS_PATH, which was empty.\n");
>> > +            av_log(h, AV_LOG_INFO, "- $HOME/.ipfs (full uri: %s) which doesn't exist.\n", ipfs_full_data_folder);
>> > +            ret = AVERROR(ENOENT);
>> > +            goto err;
>> > +        }
>>
>> Right. This makes more sense
>>
>> > +    } else
>> > +        snprintf(ipfs_full_data_folder, PATH_MAX, "%s", getenv("IPFS_PATH"));
>> > +
>> > +    // Copy the fully composed gateway path into ipfs_gateway_file.
>> > +    if (snprintf(ipfs_gateway_file, PATH_MAX, "%sgateway",
>> > +                 ipfs_full_data_folder) > PATH_MAX) {
>> > +        av_log(h, AV_LOG_ERROR, "The IPFS gateway file path exceeds the max path length (%i)\n", PATH_MAX);
>> > +        ret = AVERROR(ENOENT);
>> > +        goto err;
>> > +    }
>> > +
>> > +    // Get the contents of the gateway file.
>> > +    gateway_file = av_fopen_utf8(ipfs_gateway_file, "r");
>> > +    if (!gateway_file) {
>> > +        av_log(h, AV_LOG_ERROR, "The IPFS gateway file (full uri: %s) doesn't exist. Is the gateway enabled?\n", ipfs_gateway_file);
>> > +        ret = AVERROR(ENOENT);
>> > +        goto err;
>> > +    }
>> > +
>> > +    // Read a single line (fgets stops at new line mark).
>> > +    fgets(gateway_file_data, PATH_MAX - 1, gateway_file);
>> > +
>> > +    // Replace the last char with \0
>> > +    gateway_file_data[PATH_MAX - 1] = 0;
>> > +
>> > +    // Replace first occurence of end of line with \0
>> > +    gateway_file_data[strcspn(gateway_file_data, "\r\n")] = 0;
>>
>> Won't this be just \n on Unix-like systems? Could be two lines, one
>> with strcspn(.., "\n") and the other with "\r".
>>
>
> The current logic works with linux.
> Splitting it up regardless, it can't hurt and might catch some nasty edge
> cases.
>
>>
>> > +
>> > +    // If strlen finds anything longer then 0 characters then we have a
>> > +    // potential gateway url.
>> > +    if (strlen(gateway_file_data) < 1) {
>> > +        av_log(h, AV_LOG_ERROR, "The IPFS gateway file (full uri: %s) appears to be empty. Is the gateway started?\n", ipfs_gateway_file);
>> > +        ret = AVERROR(EILSEQ);
>> > +        goto err;
>> > +    }
>> > +
>> > +    // At this point gateway_file_data contains at least something.
>> > +    // Copy it into gateway.
>> > +    if (snprintf(gateway, PATH_MAX, "%s", gateway_file_data) > 0) {
>> > +        ret = 1;
>> > +        goto err;
>> > +    } else
>> > +        av_log(h, AV_LOG_DEBUG, "Unknown error in the IPFS gateway file.\n");
>> > +
>> > +err:
>> > +    if (gateway_file)
>> > +        fclose(gateway_file);
>> > +
>> > +    return ret;
>> > +}
>> > +
>> > +// For now just makes sure that the gateway ends in url we expect.
>> > +// Like http://localhost:8080/.
>> > +// Explicitly with the traling slash.
>> > +static int sanitize_ipfs_gateway(URLContext *h, char *gateway)
>> > +{
>> > +    const char *url_without_protocol;
>> > +    int ret = 1;
>> > +
>> > +    // Test if the gateway starts with either http:// or https://
>> > +    // The remainder is stored in url_without_protocol
>> > +    if (av_stristart(gateway, "http://", &url_without_protocol) == 0
>> > +        && av_stristart(gateway, "https://", &url_without_protocol) == 0) {
>> > +        av_log(h, AV_LOG_ERROR, "The gateway URL didn't start with http:// or https:// and is therefore invalid.\n");
>> > +        ret = AVERROR(EILSEQ);
>> > +        goto err;
>> > +    }
>> > +
>> > +    // We now know the remainder of the url without the protocol. Check it for
>> > +    // some length. At least 1 character.
>> > +    if (strlen(url_without_protocol) < 1) {
>> > +        av_log(h, AV_LOG_ERROR, "The gateway url (without the protocol part) is too short to be a valid URL.\n");
>> > +        ret = AVERROR(EILSEQ);
>> > +        goto err;
>> > +    }
>>
>> I think we can just rely on the http protocol stuff to deal with this
>>
>
> Do you mind if I keep it in? It doesn't seem to hurt and again does help
> in debugging scenarios.
>
>
>>
>> > +
>> > +
>> > +    if (gateway[strlen(gateway) - 1] != '/') {
>> > +        // Check if we have enough room to add a '/'
>> > +        // We already know that it's bigger then 0 because that's handled
>> > +        // in populate_ipfs_gateway
>> > +
>> > +        if (strlen(gateway) < (PATH_MAX - 2)) {
>> > +            // We have room. Get the length and add a '/' and a '\0'
>> > +            int gateway_length = strlen(gateway);
>> > +            gateway[gateway_length] = '/';
>> > +            gateway[gateway_length + 1] = '\0';
>> > +        } else
>> > +            av_log(h, AV_LOG_ERROR, "The gateway url is longer then the allowed max path length (%i).\n", PATH_MAX);
>> > +
>> > +        ret = 1;
>> > +        goto err;
>> > +    }
>>
>> Rolling this into the snprintf() would be prettier
>>
>
> Ugh :P
> Well, I did that before I changed it to this. It worked but it threw a
> compile warning about the resulting size truncating the gateway variable.
> This seemed simpler than checking for it. But then again, I am now
> checking for that length in other places too so I might as well do that
> here too.
>
>
>>
>> > +
>> > +err:
>> > +    return ret;
>> > +}
>> > +
>> > +static int translate_ipfs_to_http(URLContext *h, const char *uri,
>> > +                                  int flags, AVDictionary **options)
>> > +{
>> > +    const char *ipfs_cid;
>> > +    char *fulluri = NULL;
>> > +    char ipfs_gateway[PATH_MAX];
>> > +    int ret;
>> > +    IPFSGatewayContext *c = h->priv_data;
>> > +
>> > +    // Test for ipfs://, ipfs:, ipns:// and ipns:. This prefix is stripped from
>> > +    // the string leaving just the CID in ipfs_cid.
>> > +    int is_ipfs = av_stristart(uri, "ipfs://", &ipfs_cid);
>> > +    int is_ipns = av_stristart(uri, "ipns://", &ipfs_cid);
>> > +
>> > +    // We must have either ipns or ipfs.
>> > +    if (!is_ipfs && !is_ipns) {
>> > +        ret = AVERROR(EINVAL);
>> > +        av_log(h, AV_LOG_ERROR, "Unsupported url %s\n", uri);
>> > +        goto err;
>> > +    }
>>
>> Shouldn't this check that the URL is precisely what the protocol
>> expects? ipfs for ff_ipfs_protocol and ipns for ff_ipns_protocol.
>> Maybe that's validated further up however..
>>
>
> Well, the protocol has 2 potential ways:
> <gateway>/<protocol>/<ipfs cid> (existed since the start of IPFS)
> and
> <ipfs cid>.<protocol>.<gateway> (called subdomain gateway)
>
> An example of both:
>
> http://localhost:8080/ipfs/bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi
>
> http://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi.ipfs.localhost:8080
>
> Both are valid but the former is easier.
> The latter really mostly exists for website/browser purposes [1].
> Therefore I like to use the former because it's much easier. It only
> requires a gateway provided by the user.
> And after that it's only a matter of appending strings. The alternative
> option is more tricky to get right.
> Regardless, the way I concatenate the strings now is how I construct a
> valid url.
>
> If that URL isn't valid then there's likely a gateway permission issue.
> But that's outside the scope of this protocol and likely something the
> http(s) protocol informs you about anyhow.
>
> Anyhow, thank you again for the review!
> I'll send a V6 later this evening.
> Are we close to merging it? ;)
>
> [1]
> https://codeclimbing.com/why-i-am-excited-about-subdomain-ipfs-gateways/
>
>

Sorry for asking a perhaps stupid question, but what is the normal, sane way
in C to append to a stack array?

I now have this:

    if (c->gateway_buffer[strlen(c->gateway_buffer) - 1] != '/'
        && strlen(c->gateway_buffer) < (sizeof(c->gateway_buffer) - 1)) {
        // Add a "/". strncat also takes care of adding a null terminating character.
        strncat(c->gateway_buffer, "/", sizeof(c->gateway_buffer) - strlen(c->gateway_buffer) - 1);
        ret = 1;
        goto err;
    }

That uses strncat and works without errors or warnings.
But using snprintf to append keeps giving me these warnings, which I don't
understand:

libavformat/ipfsgateway.c:189:9: warning: ‘snprintf’ argument 4 overlaps destination object ‘c’ [-Wrestrict]
  189 |     if (snprintf(c->gateway_buffer, sizeof(c->gateway_buffer) - 50, "%s/",
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  190 |                  c->gateway_buffer) > 0) {
      |                  ~~~~~~~~~~~~~~~~~~
libavformat/ipfsgateway.c:168:25: note: destination object referenced by ‘restrict’-qualified argument 1 was declared here
  168 |     IPFSGatewayContext *c = h->priv_data;
      |                         ^

It's not an issue if the use of strncat is fine in the above example :)

>
>> /Tomas
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>>
>
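On the snprintf question above: the C standard leaves snprintf undefined when the destination and a source argument overlap, which is what -Wrestrict is reporting when the same buffer is passed as both. One way to append without overlap is to write at the end of the existing string instead. The following is only a sketch under assumptions: ensure_trailing_slash is a hypothetical helper name, not something from the patch, and the return convention (0 or -1) is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): make sure the string in
 * buf ends with '/'. size is the full capacity of buf, e.g. sizeof(buf).
 * Returns 0 on success, -1 if buf is empty or has no room left. */
static int ensure_trailing_slash(char *buf, size_t size)
{
    size_t len;

    assert(buf != NULL && size > 0);
    len = strlen(buf);

    if (len == 0)
        return -1;              /* nothing to append to */
    if (buf[len - 1] == '/')
        return 0;               /* already terminated with a slash */
    if (len + 2 > size)
        return -1;              /* need room for '/' and the NUL byte */

    /* Write starting at the current terminator. The only source is the
     * literal "/", so source and destination never overlap. */
    snprintf(buf + len, size - len, "/");
    return 0;
}
```

The strncat call in the message above does the same job and is fine as long as the length check precedes it. FFmpeg's libavutil also offers av_strlcat(dst, src, size), which takes the total buffer size and is commonly used for this kind of append.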