st_size of a symlink
Richard Weinberger
2012-07-23 16:00:02 UTC
Hi!

lstat(2) on /proc/$pid/exe gives me a stat object where st_size is 0.

Or:
***@mantary:~> ls -l /proc/$$/exe
lrwxrwxrwx 1 rw users 0 23. Jul 17:02 /proc/16902/exe -> /bin/bash

The lstat(2) manpage says:
"The st_size field gives the size of the file (if it is a regular file
or a symbolic link) in bytes. The size of a symbolic link is the length
of the pathname it contains, without a terminating null byte."

This property is also used in the example in the readlink(2) manpage.

Is this a procfs issue or is the manpage wrong?

Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Jesper Juhl
2012-07-23 18:10:02 UTC
Post by Richard Weinberger
Hi!
lstat(2) on /proc/$pid/exe gives me a stat object where st_size is 0.
lrwxrwxrwx 1 rw users 0 23. Jul 17:02 /proc/16902/exe -> /bin/bash
"The st_size field gives the size of the file (if it is a regular file or a
symbolic link) in bytes. The size of a symbolic link is the length of the
pathname it contains, without a terminating null byte."
This property is also used in the example in the readlink(2) manpage.
Is this a procfs issue or is the manpage wrong?
I have relied on that behaviour (the size of a symlink being the length of
the pathname it contains) in the past, so I know it used to work and I'd
expect it to keep working.
I honestly never really thought about procfs, but checking now, it does
seem that procfs doesn't quite do things right...

Just so we all know what kernel I'm running:
[***@arch tmp]# uname -a
Linux arch 3.4.6-1-ARCH #1 SMP PREEMPT Fri Jul 20 08:21:26 CEST 2012 x86_64 GNU/Linux

Let's see what procfs reports:
[***@arch ~]# ls -l /proc/$$/exe
lrwxrwxrwx 1 root root 0 Jul 23 19:58 /proc/884/exe -> /bin/bash
Doesn't seem quite right....

Now let's see what tmpfs reports:
[***@arch tmp]# mount | grep /tmp
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,relatime)
[***@arch ~]# cd /tmp
[***@arch tmp]# ln -s /tmp foo
[***@arch tmp]# ls -l foo
lrwxrwxrwx 1 root root 4 Jul 23 19:59 foo -> /tmp
Seems OK.

Let's check ext4:
[***@arch tmp]# mount | grep /home
/dev/sda4 on /home type ext4 (rw,relatime,data=ordered)
[***@arch tmp]# cd /home/jj
[***@arch jj]# touch foo
[***@arch jj]# ln -s foo bar
[***@arch jj]# ls -l bar
lrwxrwxrwx 1 root root 3 Jul 23 20:03 bar -> foo
Seems OK as well..

So how about devtmpfs?
[***@arch jj]# mount | grep devtmpfs
dev on /dev type devtmpfs (rw,nosuid,relatime,size=779400k,nr_inodes=194850,mode=755)
[***@arch jj]# ls -l /dev/stderr
lrwxrwxrwx 1 root root 15 Jul 23 19:46 /dev/stderr -> /proc/self/fd/2
Also looks OK...

So, from my point of view it looks like procfs is the one who has got it
wrong.
We should probably fix that (IMVHO).
--
Jesper Juhl <***@chaosbits.net> http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.

Al Viro
2012-07-23 20:30:02 UTC
Post by Jesper Juhl
So, from my point of view it looks like procfs is the one who has got it
wrong.
We should probably fix that (IMVHO).
Fix it _how_? Try to rename a binary you have running in a process.
Or rename its cwd. Or rename an opened file. Watch the corresponding
procfs symlink (still pointing to the same object) change. With
no way to tell that some sucker had looked at st_size some time ago
and might get surprised by the change.

The fact is, st_size is just a useful hint for symlink target length.
It tells you the likely sufficient size of buffer. There's a reason
why readlink(2) returns what it returns; you *can't* rely on the
earlier lstat() results or, for that matter, any prior information.
If nothing else, I could rm that symlink and create a new one in
the meanwhile. You need to check what it had returned and deal with
insufficient buffer size. By retrying readlink() with bigger buffer.
With procfs there's just a few more ways the readlink() output can
change, that's all.
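The retry approach described above could look roughly like the following C sketch. The helper name read_link_alloc and the 64-byte fallback are assumptions made for illustration; nothing here is taken from the kernel or from the readlink(2) man page example. The size hint may come from st_size, and a hint of 0 (as procfs reports) is handled like any other wrong guess.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): read the target of the symlink
 * at `path` into a malloc'ed, NUL-terminated string. `size_hint` is only a
 * hint (e.g. st_size from lstat); the buffer grows until the target fits.
 * Returns NULL on error with errno set. The caller frees the result. */
char *read_link_alloc(const char *path, size_t size_hint)
{
    /* Start one byte above the hint; fall back to 64 if the hint is 0. */
    size_t cap = size_hint > 0 ? size_hint + 1 : 64;
    char *buf = NULL;

    assert(path != NULL);

    for (;;) {
        char *tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            errno = ENOMEM;
            return NULL;
        }
        buf = tmp;

        ssize_t n = readlink(path, buf, cap);
        if (n < 0) {
            int saved = errno;   /* keep readlink's errno across free() */
            free(buf);
            errno = saved;
            return NULL;
        }

        /* readlink() truncates silently, so n == cap means the target
         * may not have fit: retry with a bigger buffer. */
        if ((size_t)n < cap) {
            buf[n] = '\0';
            return buf;
        }
        cap *= 2;
    }
}
```

A caller would use it as char *t = read_link_alloc("/proc/self/exe", 0); and free(t) afterwards. Because the loop trusts only the return value of readlink(), it behaves the same whether the hint was 0, stale, or exact.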
Jesper Juhl
2012-07-23 20:50:02 UTC
Post by Al Viro
Post by Jesper Juhl
So, from my point of view it looks like procfs is the one who has got it
wrong.
We should probably fix that (IMVHO).
Fix it _how_?
By returning the size as the number of bytes in the name the link is
currently pointing at.
Post by Al Viro
Try to rename a binary you have running in a process.
Or rename its cwd. Or rename an opened file. Watch the corresponding
procfs symlink (still pointing to the same object) change. With
no way to tell that some sucker had looked at st_size some time ago
and might get surprised by the change.
Sure, lengths may change and an app needs to be prepared for that, but
that's no reason to always return 0 (zero) as the length for links in procfs.
We can do better and return the actual length of whatever the link is
pointing to currently - just like other filesystems do.
Post by Al Viro
The fact is, st_size is just a useful hint for symlink target length.
Sure.
Post by Al Viro
It tells you the likely sufficient size of buffer. There's a reason
why readlink(2) returns what it returns; you *can't* rely on the
earlier lstat() results or, for that matter, any prior information.
I know that. That's not the issue. The issue is that procfs *could* return
more useful info than it does currently.
Post by Al Viro
If nothing else, I could rm that symlink and create a new one in
the meanwhile. You need to check what it had returned and deal with
insufficient buffer size.
Of course.
Post by Al Viro
By retrying readlink() with bigger buffer.
With procfs there's just a few more ways the readlink() output can
change, that's all.
Still not a good reason to just return 0 IMHO.
--
Jesper Juhl <***@chaosbits.net> http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.

Richard Weinberger
2012-07-23 22:10:02 UTC
Post by Jesper Juhl
Post by Al Viro
Fix it _how_?
By returning the size as the number of bytes in the name the link is
currently pointing at.
This is not easy.
procfs has no clue where the link is pointing.
The target is generated only when the link is accessed.
tmpfs, on the other hand, has this information because its symlinks
only get changed through tmpfs...
Post by Jesper Juhl
Post by Al Viro
By retrying readlink() with bigger buffer.
With procfs there's just a few more ways the readlink() output can
change, that's all.
Still not a good reason to just return 0 IMHO.
IMHO the lstat() and readlink() manpages have to be more precise about
st_size.

Thanks,
//richard
Guillem Jover
2012-07-23 23:20:02 UTC
Post by Richard Weinberger
Post by Jesper Juhl
Post by Al Viro
Fix it _how_?
By returning the size as the number of bytes in the name the link is
currently pointing at.
This is not easy.
procfs has no clue where the link is pointing.
The target is generated only when the link is accessed.
tmpfs, on the other hand, has this information because its symlinks
only get changed through tmpfs...
Well, can't the link be accessed when getting the stat information
then?
Post by Richard Weinberger
Post by Jesper Juhl
Post by Al Viro
By retrying readlink() with bigger buffer.
With procfs there's just a few more ways the readlink() output can
change, that's all.
Still not a good reason to just return 0 IMHO.
IMHO the lstat() and readlink() manpages have to be more precise
about st_size.
They document what POSIX says:

<http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>

regards,
guillem
Richard Weinberger
2012-07-24 10:20:03 UTC
Post by Guillem Jover
Well, can't the link be accessed when getting the stat information
then?
Sure, but it's not trivial.
Post by Guillem Jover
Post by Richard Weinberger
IMHO the lstat() and readlink() manpages have to be more precise
about st_size.
This does not make it any better...

Thanks,
//richard