Well, I ended up with a lot of unsorted Minitokyo scans, and the file names don't help
to tell which author or studio each one belongs to,
so I've written this script for a GNU system (Linux, BSD, or any other Unix clone, for those who don't know
what GNU is).
Code:
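#!/bin/bash
# Folder for files whose names no longer follow the Minitokyo pattern.
mkdir -p special/
W3M="w3m -dump -F"
# trickle -d 1 caps the download rate at 1 KB/s.
TRICKLE="trickle -d 1"
COMMAND="$TRICKLE $W3M"
# Comment out the next line to fetch through trickle instead of plain w3m.
COMMAND="$W3M"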
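# Move .jpg files with no run of 4+ digits in the name (no gallery ID) to special/.
ls *.jpg |sed '/[0-9]\{4,\}/d'|while read -r a ;do mv -v --backup=numbered "$a" special/ ;done
# Move Mini* files that are not .jpg to special/ as well.
find . -maxdepth 1 -type f -name 'Mini*' ! -name '*.jpg' -exec mv -v --backup=numbered "{}" special/ ';'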
#ls Minitokyo*.jpg |sed 's/\(_\|\.\)/\t/g'|sed 's/^Minitokyo//'
#ls Minitokyo* |sed 's/\(_\|\.\)/\t/g'|sed 's/^Minitokyo//' |tr -c '[0-9\n]' ' ' |sed 's/^\ \+//'|egrep -e "^[0-9]{4,6}\W" |cut -f1 \
#ls Minitokyo* |sed 's/\(_\|\.\)/\ /g' |tr -c '[0-9\n]' ' ' |sed 's/^[0-9]\{0,3\}\W//g' |sed 's/^\ \+//'|egrep -e "[0-9]{4,6}\W" |sed 's/\ \+//g' \
# Pull the gallery ID out of each file name, then look up the "Official Creator" on its gallery page.
ls *.jpg |sed 's/^.\+_//'|sed 's/\(_\|\.\)/\ /g' |tr -c '[0-9\n]' ' ' |sed 's/[^0-9][0-9]\{0,4\}[^0-9]//g'|sed 's/\([^0-9]\|^\)[0-9]\{0,3\}\([^0-9]\|$\)//g' \
|while read num jpg ;do
    if [ "$num" != "" ];then
        cat=$($COMMAND "http://gallery.minitokyo.net/view/$num" |grep -A1 "Official Creator:"|tail -n1|sed 's/^\ \+//'|sed 's/\(\ \|\/\)/_/g');
        # Quote "$cat" so an empty or unusual value cannot break the test or the paths.
        if [ -n "$cat" ] ;then
            echo "'$num' --> '$cat'" ;
            mkdir -p "$cat" ;
#           mv -v --backup=numbered *$num*.jpg $cat/;
#           mv --backup=numbered *$num*.jpg $cat/;
            mv -f *"$num"*.jpg "$cat"/;
        else
            echo "'$num' --> 'unknown'" ;
            mkdir -p unknown ;
#           mv -v --backup=numbered *$num*.jpg unknown/ ;
#           mv --backup=numbered *$num*.jpg unknown/ ;
            mv -f *"$num"*.jpg unknown/ ;
        fi
    fi
    sleep 1
done
# Stop here if unknown/ was never created, so the second pass cannot run in the wrong folder.
cd unknown/ || exit 1
echo "categorizing unknown with artist tag in the comment"
ls *.jpg|sed 's/^.\+_//' |sed 's/\(_\|\.\)/\ /g' |tr -c '[0-9\n]' ' ' |sed 's/[^0-9][0-9]\{0,4\}[^0-9]//g'|sed 's/\([^0-9]\|^\)[0-9]\{0,3\}\([^0-9]\|$\)//g' \
|while read num jpg ;do
    if [ "$num" != "" ];then
        # -m1 keeps only the first "artist:" line, so the folder name stays on one line.
        cat=$($COMMAND "http://gallery.minitokyo.net/view/$num" |grep -i -m1 "artist:"|sed 's/^.\+://'|sed 's/^\ \+//'|sed 's/\(\ \|\/\)/_/g');
        if [[ -n $cat ]] ;then
            echo "'$num' --> '$cat'" ;
            mkdir -p "$cat" ;
#           mv --backup=numbered *$num*.jpg "$cat"/;
            mv -f *"$num"*.jpg "$cat"/;
        else
            echo "'$num' --> 'unknown'" ;
            mkdir -p unknown ;
#           mv -v --backup=numbered *$num*.jpg unknown/ ;
#           mv --backup=numbered *$num*.jpg unknown/ ;
            mv -f *"$num"*.jpg unknown/ ;
        fi
    fi
    sleep 1
done
cd ..
File names that don't conform to the Minitokyo naming standard (renamed by the user) are put under the "special" folder.
Files that don't have a proper studio/original creator tag are put under "unknown".
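To make the "special" rule easier to see, here is a minimal sketch of pulling the gallery ID out of a file name. The helper name extract_id and the example file name are made up for illustration, and it only looks after the last underscore, which is simpler than the sed pipeline in the script:

```shell
#!/bin/bash
# Hypothetical helper: print the gallery ID found in a file name, or fail.
# Assumes names shaped like Minitokyo.Some.Title_123456.jpg (an assumption).
extract_id() {
    local name=${1##*/}      # drop any directory part
    name=${name%.*}          # drop the extension
    local tail=${name##*_}   # keep what follows the last underscore
    if [[ $tail =~ ([0-9]{4,}) ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"   # first run of 4+ digits
    else
        return 1             # no ID: this file would go to special/
    fi
}
```

For example, extract_id "Minitokyo.Some.Title_123456.jpg" prints 123456, while a renamed file such as photo_12.jpg makes it return a non-zero status.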
In the unknown folder the script tries to figure out the author from the submitter's comment or other users' posts, but for the
time being it expects them to write
"artist:", so it doesn't always work.
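One way this lookup could be made a bit more forgiving is to accept more than one label. This is only a sketch: the function name creator_from_text is made up, it reads already dumped page text on stdin, and it relies on the GNU sed I flag for case-insensitive matching:

```shell
#!/bin/bash
# Hypothetical helper: read page text on stdin and print the first
# "artist:" or "author:" value as a folder-safe name (empty if none found).
creator_from_text() {
    grep -i -m1 -E '(artist|author)[[:space:]]*:' \
        | sed -E 's/^.*(artist|author)[[:space:]]*:[[:space:]]*//I; s/[[:space:]]+$//; s#[ /]#_#g'
}
```

Feeding it a comment line like "Artist: Jane Doe" gives Jane_Doe, and text without either label gives an empty result, which the script would still treat as unknown.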
The script isn't user friendly (it's full of commented-out code),
so be cautious about losing files if something goes wrong (who knows).
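If you are worried about losing files, one option is to wrap the moves in a dry-run switch. This is a sketch, not part of the script above: DRY_RUN and move_file are names invented here, and --backup=numbered is the GNU mv option already present in the commented-out lines:

```shell
#!/bin/bash
# Hypothetical safety wrapper: with DRY_RUN=1 (the default here) nothing is moved.
DRY_RUN=${DRY_RUN:-1}

move_file() {
    local src=$1 dest_dir=$2
    if [ "$DRY_RUN" = 1 ]; then
        echo "would move '$src' -> '$dest_dir/'"
    else
        # --backup=numbered keeps an existing file of the same name instead of overwriting it
        mkdir -p -- "$dest_dir" && mv -v --backup=numbered -- "$src" "$dest_dir/"
    fi
}
```

Run it once with the default to read what would happen, then set DRY_RUN=0 to really move the files.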
I plan to add some Exif tags to the files, but I'm not familiar with that (MP3 tags are easier to manipulate and
universally accepted),
so drop any suggestions you've got here ;)
merged: 06-25-2009 ~ 02:04pm
I forgot to mention that if you want to reduce the bandwidth used by this script, you can comment out the second COMMAND= line, but
you need trickle (and its trickled daemon) installed for that to work.
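Instead of commenting the line by hand, the choice could be made automatically. A small sketch, assuming the same W3M and COMMAND variables as in the script and using command -v to check whether trickle is on the PATH:

```shell
#!/bin/bash
W3M="w3m -dump -F"
# Use trickle (1 KB/s download cap) only when it is actually installed.
if command -v trickle >/dev/null 2>&1; then
    COMMAND="trickle -d 1 $W3M"
else
    COMMAND="$W3M"
fi
echo "fetching with: $COMMAND"
```

Either way COMMAND ends with the plain w3m call, so the rest of the script would not need to change.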