From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v17 1/5] libavutil: Add wchartoutf8(), wchartoansi(), utf8toansi() and getenv_utf8()
Date: Sun, 19 Jun 2022 09:33:52 +0300 (EEST)
Message-ID: <56bf886d-8dcc-b04d-32fd-246d1869ff5@martin.st>
In-Reply-To: <DM8P223MB03655B2390A8B6C66CB81D65BAB19@DM8P223MB0365.NAMP223.PROD.OUTLOOK.COM>

On Sun, 19 Jun 2022, Soft Works wrote:

>
>
>> -----Original Message-----
>> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
>> Andreas Rheinhardt
>> Sent: Sunday, June 19, 2022 6:59 AM
>> To: ffmpeg-devel@ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [PATCH v17 1/5] libavutil: Add
>> wchartoutf8(), wchartoansi(), utf8toansi() and getenv_utf8()
>>
>> Nil Admirari:
>>> wchartoutf8() converts strings returned by WinAPI into UTF-8,
>>> which is FFmpeg's preferred encoding.
>>>
>>> Some external dependencies, such as AviSynth, are still
>>> not Unicode-enabled. utf8toansi() converts UTF-8 strings
>>> into ANSI in two steps: UTF-8 -> wchar_t -> ANSI.
>>> wchartoansi() is responsible for the second step of the conversion.
>>> Conversion in just one step is not supported by WinAPI.
>>>
>>> Since these character converting functions allocate the buffer
>>> of necessary size, they also facilitate the removal of MAX_PATH limit
>>> in places where fixed-size ANSI/WCHAR strings were used
>>> as filename buffers.
>>>
>>> getenv_utf8() wraps _wgetenv() converting its input from
>>> and its output to UTF-8. Compared to plain getenv(),
>>> getenv_utf8() requires a cleanup.
>>>
>>> Because of that, in places that only test the existence of
>>> an environment variable or compare its value with a string
>>> consisting entirely of ASCII characters, the use of plain getenv()
>>> is still preferred. (libavutil/log.c check_color_terminal()
>>> is an example of such a place.)
>>>
>>> Plain getenv() is also preferred in UNIX-only code,
>>> such as bktr.c, fbdev_common.c, oss.c in libavdevice
>>> or af_ladspa.c in libavfilter.
>>> ---
>>>  configure                  |  1 +
>>>  libavutil/getenv_utf8.h    | 71 ++++++++++++++++++++++++++++++++++++++
>>>  libavutil/wchar_filename.h | 51 +++++++++++++++++++++++++++
>>>  3 files changed, 123 insertions(+)
>>>  create mode 100644 libavutil/getenv_utf8.h
>>>
>>> diff --git a/configure b/configure
>>> index 3dca1c4bd3..fa37a74531 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -2272,6 +2272,7 @@ SYSTEM_FUNCS="
>>>      fcntl
>>>      getaddrinfo
>>>      getauxval
>>> +    getenv
>>>      gethrtime
>>>      getopt
>>>      GetModuleHandle
>>> diff --git a/libavutil/getenv_utf8.h b/libavutil/getenv_utf8.h
>>> new file mode 100644
>>> index 0000000000..161e3e6202
>>> --- /dev/null
>>> +++ b/libavutil/getenv_utf8.h
>>> @@ -0,0 +1,71 @@
>>> +/*
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +#ifndef AVUTIL_GETENV_UTF8_H
>>> +#define AVUTIL_GETENV_UTF8_H
>>> +
>>> +#include <stdlib.h>
>>> +
>>> +#include "mem.h"
>>> +
>>> +#ifdef HAVE_GETENV
>>> +
>>> +#ifdef _WIN32
>>> +
>>> +#include "libavutil/wchar_filename.h"
>>> +
>>> +static inline char *getenv_utf8(const char *varname)
>>> +{
>>> +    wchar_t *varname_w, *var_w;
>>> +    char *var;
>>> +
>>> +    if (utf8towchar(varname, &varname_w))
>>> +        return NULL;
>>> +    if (!varname_w)
>>> +        return NULL;
>>> +
>>> +    var_w = _wgetenv(varname_w);
>>> +    av_free(varname_w);
>>> +
>>> +    if (!var_w)
>>> +        return NULL;
>>> +    if (wchartoutf8(var_w, &var))
>>> +        return NULL;
>>> +
>>> +    return var;
>>> +
>>> +    // No CP_ACP fallback compared to other *_utf8() functions:
>>> +    // non UTF-8 strings must not be returned.
>>> +}
>>> +
>>> +#else
>>> +
>>> +static inline char *getenv_utf8(const char *varname)
>>> +{
>>> +    return av_strdup(getenv(varname));
>>
>> This forces allocations and frees in scenarios where this is wholly
>> unnecessary.
>
> Why do you think this is unnecessary? At least on Windows, there is
> no guarantee regarding the lifetime of strings returned from
> getenv(). In case when some other code would call _putenv to set the
> env variable, this can cause the previously returned string to become
> invalid without the caller being able to know.

Yes, if you would keep the return value from getenv for too long, while
something else changes the environment in the same process, you'd have
such an issue. But that hasn't been a concern so far - right? And isn't
what we try to fix here.

>> This can be avoided by adding a custom deallocator for
>> strings returned via getenv_utf8: Namely a define/wrapper around
>> av_free in the _WIN32 and a no-op else.
>
> I don't think I really understand what you mean by the above 😉

He means that we could add a getenv_utf8_free(), which would be a no-op
on unix, while keeping getenv_utf8() just returning plain getenv() on
unix. Then on Windows we'd do the extra alloc/free.

That sounds doable to me. There's a bigger risk of forgetting to free the
getenv_utf8 output (since tools like valgrind and asan wouldn't notice it
on unix), but that risk is probably acceptable.

// Martin
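
For illustration, the getenv_utf8_free() suggestion could look roughly
like the sketch below. It is not code from the patch: the names
sketch_getenv_utf8() and sketch_getenv_utf8_free() are hypothetical, and
the Windows branch calls the WinAPI conversion functions directly with
malloc()/free() instead of the utf8towchar()/wchartoutf8() helpers and
av_free() that the patch uses.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#include <wchar.h>

/* Windows: look the variable up as UTF-16 and return a newly
 * allocated UTF-8 copy that the caller has to release. */
static char *sketch_getenv_utf8(const char *name)
{
    wchar_t *name_w, *val_w;
    char *val;
    int n;

    /* UTF-8 name -> UTF-16; the first call only queries the size. */
    n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, name, -1, NULL, 0);
    if (n <= 0)
        return NULL;
    name_w = malloc((size_t)n * sizeof(*name_w));
    if (!name_w)
        return NULL;
    MultiByteToWideChar(CP_UTF8, 0, name, -1, name_w, n);

    val_w = _wgetenv(name_w);
    free(name_w);
    if (!val_w)
        return NULL;

    /* UTF-16 value -> UTF-8, again sizing the buffer first. */
    n = WideCharToMultiByte(CP_UTF8, 0, val_w, -1, NULL, 0, NULL, NULL);
    if (n <= 0)
        return NULL;
    val = malloc((size_t)n);
    if (val)
        WideCharToMultiByte(CP_UTF8, 0, val_w, -1, val, n, NULL, NULL);
    return val;
}

/* Windows: the string was allocated above, so really free it. */
static void sketch_getenv_utf8_free(char *val)
{
    free(val);
}

#else

/* Elsewhere: hand back the pointer from getenv() without copying. */
static char *sketch_getenv_utf8(const char *name)
{
    return getenv(name);
}

/* Elsewhere: nothing was allocated, so this is deliberately a no-op. */
static void sketch_getenv_utf8_free(char *val)
{
    (void)val;
}

#endif
```

A caller would pair every sketch_getenv_utf8() with a
sketch_getenv_utf8_free() on all platforms. As noted above, a missing
free would only show up as a leak on Windows, since the unix variant
never allocates.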