Problem

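Lao Gao recently ran into a requirement: scrape data with Selenium on Linux (CentOS). That would normally be simple, but because of memory limits, installing a full X Window desktop was not realistic. That led to a slightly crazy idea: could a virtual desktop run from the CentOS command line, with Selenium driving a Firefox browser inside that virtual desktop, so that everything is done from the command line?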

A Google search turned up Xvfb, which creates a virtual X display. Combined with Python's pyvirtualdisplay, it is exactly the right tool!

Installation

On CentOS:

# Install Xvfb and pyvirtualdisplay
yum install xorg-x11-server-Xvfb
pip install pyvirtualdisplay

Install Firefox and Selenium

yum install firefox
pip install selenium
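
Before running the scraper, it can help to confirm that the two binaries installed above are actually on PATH. The following is only a minimal sketch: the helper name missing_binaries is made up for this example, and it checks nothing beyond whether the executables can be found.

```python
import shutil


def missing_binaries(names, which=shutil.which):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


# Check the two executables that the yum commands above are expected to install.
missing = missing_binaries(["Xvfb", "firefox"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("Xvfb and firefox were both found.")
```

The `which` parameter exists only so the lookup can be replaced in a test; in normal use the default shutil.which is enough.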

Code

from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(800, 600))
display.start()

browser = webdriver.Firefox()
browser.get('http://www.google.com')
print(browser.title)
browser.quit()

display.stop()

References:

http://selenium-python.readthedocs.org/en/latest/getting-started.html
http://nullege.com/codes/search/selenium.webdriver.Remote.find_elements_by_class_name
http://www.opsview.com/forum/opsview-core/how-do-i/how-do-i-install-selenium-centos-server
https://gist.github.com/textarcana/5855427
http://scraping.pro/use-headless-firefox-scraping-linux/
http://serverfault.com/questions/363827/how-can-i-run-firefox-on-centos-with-no-display
https://realpython.com/blog/python/headless-selenium-testing-with-python-and-phantomjs/
https://pypi.python.org/pypi/selenium
http://selenium.googlecode.com/git/docs/api/py/selenium/selenium.selenium.html#module-selenium.selenium
http://www.ibm.com/developerworks/cn/opensource/os-php-designptrns/
http://www.cnblogs.com/fnng/p/3230768.html
http://www.cnblogs.com/fnng/p/3157639.html

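Primary keys

-- Add a primary key to the current table
-- (TINYINT UNSIGNED allows at most 255 ids; use INT UNSIGNED for larger tables)
ALTER TABLE `tablename`
	ADD COLUMN id TINYINT UNSIGNED NOT NULL AUTO_INCREMENT,
	ADD PRIMARY KEY (id);

-- Drop the primary key
-- (if the key column is AUTO_INCREMENT, remove that attribute first, otherwise MySQL rejects the statement)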

ALTER TABLE `tablename`
	DROP PRIMARY KEY;

Create a database

# utf8mb4_unicode_ci compares and sorts more accurately
CREATE DATABASE IF NOT EXISTS typecho DEFAULT CHARSET utf8mb4 COLLATE utf8mb4_unicode_ci;
# utf8mb4_general_ci is faster
CREATE DATABASE IF NOT EXISTS typecho DEFAULT CHARSET utf8mb4 COLLATE utf8mb4_general_ci;
CREATE DATABASE typecho DEFAULT CHARACTER SET gbk COLLATE gbk_chinese_ci;

Create a user and grant privileges

# Create the user only
CREATE USER phpergao@'localhost' IDENTIFIED BY 'yourpasswd';

# Grant privileges
GRANT select,update on phpergao.* to phpergao@'localhost';

GRANT index ON phpergao.* TO phpergao@'192.168.0.%';

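# Create the user and grant privileges in one statement
# (GRANT ... IDENTIFIED BY was removed in MySQL 8.0; there, run CREATE USER first and then GRANT)
GRANT ALL PRIVILEGES ON phpergao.* TO 'phpergao'@'localhost' IDENTIFIED BY 'yourpasswd';

# REVOKE is the reverse and uses almost the same syntax as GRANT; just replace the keyword "TO" with "FROM":
REVOKE ALL PRIVILEGES ON phpergao.* FROM 'phpergao'@'localhost';

# ALL PRIVILEGES can be replaced with any of these 14 privileges: select, insert, update, delete, create, drop, index, alter, grant, references, reload, shutdown, process, file.

# Delete a user (DROP USER 'phpergao'@'localhost'; does the same without editing the grant table by hand)
DELETE FROM mysql.user WHERE User='phpergao' AND Host='localhost';

# Change a user's password
UPDATE mysql.user SET Password = PASSWORD('newpasswd') WHERE User = 'phpergao' AND Host = 'localhost';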

Flush privileges

FLUSH PRIVILEGES;

View user privileges

# Show your own privileges
SHOW GRANTS;
# Show another user's privileges
SHOW GRANTS FOR 'phpergao'@'%';

Create a table

DROP TABLE IF EXISTS `workers_info`;  
CREATE TABLE `workers_info` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `workername` varchar(20) NOT NULL,
  `sex` enum('F','M','S') DEFAULT 'S',
  `salary` int(11) DEFAULT '0',
  `email` varchar(30) DEFAULT NULL,
  `EmployedDates` date DEFAULT NULL,
  `department` varchar(30) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;

Forgot the MySQL password

Edit the configuration file

[mysqld] 
datadir=/var/lib/mysql 
socket=/var/lib/mysql/mysql.sock 
# ADD
skip-name-resolve 
skip-grant-tables

Then restart the MySQL service and log in without a password

service mysqld restart
mysql

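Run the SQL that changes the password

# Set the password of every root account to '123456'
UPDATE mysql.user SET Password=password('123456') WHERE User='root';

Restore the MySQL configuration file and restart the service

Change a user's login host

# '%' allows login from any host
UPDATE mysql.user SET Host='%' WHERE User='root';

Reference: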

http://renxiangzyq.iteye.com/blog/763837

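For security, the server is set up so that you log in as a normal user and then su to root.

However, the su session times out after a while and drops back to the normal user, which is a nuisance.

Solution:

OS: CentOS 6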

vi /etc/profile

# Comment out this line
#TMOUT=300
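
If the same change has to be made on several machines, the edit can be scripted. This is only a sketch under assumptions: the function name comment_out_tmout is made up here, it works on the file's text rather than on /etc/profile itself, and it only recognizes plain `TMOUT=...` or `export TMOUT=...` lines.

```python
import re

# Matches an active (not yet commented) TMOUT assignment at the start of a line.
_TMOUT_LINE = re.compile(r"^([ \t]*)((?:export[ \t]+)?TMOUT=.*)$", re.MULTILINE)


def comment_out_tmout(profile_text):
    """Prefix every active TMOUT assignment with '#', leaving other lines untouched."""
    return _TMOUT_LINE.sub(r"\1#\2", profile_text)


sample = "PATH=$PATH:/usr/local/bin\nTMOUT=300\n"
print(comment_out_tmout(sample))
```

Writing the result back to /etc/profile is deliberately left out; keep a backup copy before changing that file.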

The first sudo command asks for a password; for a while afterwards, commands run without asking again. To control this timeout:

sudo visudo

# Go to around lines 60-70 and find a line like

Defaults    env_reset

# Change it to the following. 30000 means a timeout of 30000 minutes; choose a sensible value

Defaults    env_reset,timestamp_timeout=30000

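Master server IP: 192.168.0.1
Slave server IP: 192.168.0.2
First make sure both servers run the same MySQL version.

On the master

Create an account named repl

GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.0.2' IDENTIFIED BY 'xxxxxxxxx';

Edit my.cnf on the master to enable the binary log and set server-id. MySQL must be restarted after the change; if nothing needs changing, no restart is required.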

server-id=1
log_bin = /usr/local/mysql/log/mysql-bin.log

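Next, get the master's current binary log file name and offset. The slave will start replicating from this point once it is running.

flush tables with read lock;  # session-level lock; leaving the session implicitly runs unlock tables
show master status;

Create a backup of the master database. If mysqldump is not found, add this environment variable to /home/mysql/.bash_profile:

export PATH=$PATH:/usr/local/mysql/bin

mysqldump -P3306 -uroot -pxxxxxxxx test > test.sql

Then, back in the MySQL session that holds the lock:

unlock tables;

Copy the backup to the slave

scp test.sql 192.168.0.2: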

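On the slave

Import the backup into the database

mysql -uroot -pxxxxxxxx test < test.sql

Edit my.cnf on the slave and add the server-id parameter; restart MySQL if you changed it

server-id=2 # Note: must not be the same as the master's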
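
Since the two server-id values must differ, a quick check of both my.cnf files can catch a copy-paste mistake. The sketch below is an assumption-laden illustration: the function names are made up, it expects the hyphenated `server-id` spelling inside a `[mysqld]` section, and it will not cope with directives such as `!includedir`.

```python
import configparser


def read_server_id(my_cnf_text):
    """Return the server-id from the [mysqld] section, or None if it is not set."""
    parser = configparser.ConfigParser(
        allow_no_value=True,            # options such as skip-grant-tables have no value
        strict=False,                   # tolerate repeated options
        inline_comment_prefixes=("#",),
        interpolation=None,
    )
    parser.read_string(my_cnf_text)
    value = parser.get("mysqld", "server-id", fallback=None)
    return int(value) if value is not None else None


def server_ids_differ(master_cnf_text, slave_cnf_text):
    """True only when both files set server-id and the two values are different."""
    master_id = read_server_id(master_cnf_text)
    slave_id = read_server_id(slave_cnf_text)
    return master_id is not None and slave_id is not None and master_id != slave_id


master_cnf = "[mysqld]\nserver-id=1\nlog_bin = /usr/local/mysql/log/mysql-bin.log\n"
slave_cnf = "[mysqld]\nserver-id=2 # must differ from the master\n"
print(server_ids_differ(master_cnf, slave_cnf))
```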

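Specify the replication user, the master's IP and port, and the log file and position to start replicating from

CHANGE MASTER TO MASTER_HOST='192.168.0.1', MASTER_PORT=3306, MASTER_USER='repl', MASTER_PASSWORD='xxxxxxxxx', MASTER_LOG_FILE='mysql-bin.xxxx', MASTER_LOG_POS=xxxx;

For the two parameters MASTER_LOG_FILE='mysql-bin.xxxx' and MASTER_LOG_POS=xxxx, use the values that show master status reports on the master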
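
To avoid typing mistakes when copying the file name and position by hand, the statement can be assembled from the values read on the master. This is a rough sketch, not a recommended tool: the helper names are invented for this example, the log file name and position in the usage line are placeholder values, and the default port 3306 is an assumption.

```python
def sql_quote(value):
    """Wrap a string in single quotes, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_change_master(host, user, password, log_file, log_pos, port=3306):
    """Build a CHANGE MASTER TO statement from the values reported on the master."""
    return (
        "CHANGE MASTER TO "
        "MASTER_HOST={host}, MASTER_PORT={port}, MASTER_USER={user}, "
        "MASTER_PASSWORD={password}, MASTER_LOG_FILE={log_file}, "
        "MASTER_LOG_POS={log_pos};"
    ).format(
        host=sql_quote(host),
        port=int(port),
        user=sql_quote(user),
        password=sql_quote(password),
        log_file=sql_quote(log_file),
        log_pos=int(log_pos),       # the position is numeric and stays unquoted
    )


# Placeholder values; take File and Position from "show master status" on the master.
print(build_change_master("192.168.0.1", "repl", "secret", "mysql-bin.000003", 107))
```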

Start the slave threads

START SLAVE;

Check the slave status

SHOW SLAVE STATUS\G
Slave_IO_Running: Yes   # must be Yes
Slave_SQL_Running: Yes  # must be Yes
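
The two status fields can also be checked by a script, for example from a monitoring job. The sketch below assumes the vertical "Name: value" output format (one field per line) has already been captured as text; the function names are made up for this example, and running the mysql client itself is left out.

```python
def parse_slave_status(output):
    """Turn 'Name: value' lines from the slave status output into a dict."""
    status = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the row separator
            status[key.strip()] = value.strip()
    return status


def replication_running(output):
    """True only when both the I/O thread and the SQL thread report Yes."""
    status = parse_slave_status(output)
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
    )


sample = (
    "*************************** 1. row ***************************\n"
    "                  Master_Host: 192.168.0.1\n"
    "             Slave_IO_Running: Yes\n"
    "            Slave_SQL_Running: No\n"
)
print(replication_running(sample))
```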

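Testing

Insert a row on the master and check whether the slave picks it up. From there you can build on it. In short, the usual approach is master-slave replication to synchronize data, read/write splitting (MySQL-Proxy) to raise the database's concurrent load capacity, and high availability to keep the service stable.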

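While committing from the svn command line today, I hit this rather annoying problem

svn: File already exists: filesystem 'xxx/svn/xxx/db'

The fixes I found all need two commits, which is too much trouble.

Instead, run this command in the root directory of the commit

svn update path/ --accept=mine-full

One line and it is solved!

Source: http://java.dzone.com/articles/useful-subversion-pre-commit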

  1. Checks whether the commit message is not empty
  2. Checks whether the commit message consists of at least 5 characters
  3. Checks if the committed files are UTF-8 compliant
  4. Checks whether the svn:eol-style property is set to LF on newly added files
  5. Checks if the committed files have no TAB characters

The UTF-8 and TAB checks are performed on the following file suffixes

  • *.java
  • *.js
  • *.xhtml
  • *.css
  • *.xml
  • *.properties (only check for TABs here, no check for UTF-8 compliance)

The code follows.

Note: this is for Linux.

#!/bin/bash

REPOS="$1"
TXN="$2"


# Make sure that the log message contains some text.
SVNLOOK=/usr/bin/svnlook
ICONV=/usr/bin/iconv

SVNLOOKOK=1
LOGMSG=`$SVNLOOK log -t "$TXN" "$REPOS" | wc -c` 
if [ "$LOGMSG" -lt 2 ];
then
   echo -e "\t The log message is too short. Commit aborted!" 1>&2
  exit 1
fi


# Make sure that all files to be committed are encoded in UTF-8.
while read changeline; 
do

  # Get just the file (not the add / update / etc. status).
  file=${changeline:4}

  # Only check source files.
  if [[ $file == *.java || $file == *.xhtml || $file == *.css || $file == *.xml || $file == *.js ]] ; then
    $SVNLOOK cat -t "$TXN" "$REPOS" "$file" | $ICONV -f UTF-8 -t UTF-8 -o /dev/null
    if [ "${PIPESTATUS[1]}" != 0 ] ; then
      echo "Only UTF-8 files can be committed ($file)" 1>&2
      exit 1
    fi
  fi
done < <($SVNLOOK changed -t "$TXN" "$REPOS")

# Check files for svn:eol-style property
# Exit on all errors.
set -e
EOL_STYLE="LF"
echo "`$SVNLOOK changed -t "$TXN" "$REPOS"`" | while read REPOS_PATH
do
 if [[ $REPOS_PATH =~ A[[:blank:]]{3}(.*)\.(java|css|properties|xhtml|xml|js) ]]
 then
  if [ ${#BASH_REMATCH[*]} -ge 2 ]
    then
  FILENAME=${BASH_REMATCH[1]}.${BASH_REMATCH[2]};

  # Make sure every file has the right svn:eol-style property set
   if [ $EOL_STYLE != "`$SVNLOOK propget -t \"$TXN\" \"$REPOS\" svn:eol-style \"$FILENAME\" 2> /dev/null`" ]
    then
    ERROR=1;
      echo "svn ps svn:eol-style $EOL_STYLE \"$FILENAME\"" >&2
   fi
  fi
 fi
 test -z $ERROR || (echo "Please execute above commands to correct svn property settings. EOL Style LF must be used!" >& 2; exit 1)
done



# Block commits with tabs
# This is coded in python
# Exit on all errors
set -e

$SVNLOOK diff -t "$TXN" "$REPOS" | python /dev/fd/3 3<<'EOF'
import sys
ignore = True
SUFFIXES = [ ".java", ".css", ".xhtml", ".js", ".xml", ".properties" ]
filename = None

for ln in sys.stdin:

    if ignore and ln.startswith("+++ "):
        filename = ln[4:ln.find("\t")].strip()
        ignore = not filename.endswith(tuple(SUFFIXES))

    elif not ignore:
        if ln.startswith("+"):
        
           if ln.count("\t") > 0:
              sys.stderr.write("\n*** Transaction blocked, %s contains tab character:\n\n%s" % (filename, ln))
              sys.exit(1)

        if not (ln.startswith("@") or \
           ln.startswith("-") or \
           ln.startswith("+") or \
           ln.startswith(" ")):

           ignore = True

sys.exit(0)
EOF

# All checks passed, so allow the commit.
exit 0
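
For readers who prefer to test the rules outside of a hook, the message, UTF-8, and TAB checks can be sketched as small Python functions. These are illustrations only, not part of the hook above: the function names are made up, the minimum length of 5 follows the list of checks rather than the threshold used in the script, and the TAB check ignores the file-suffix filter.

```python
def message_long_enough(message, minimum=5):
    """True when the commit message has at least `minimum` non-blank characters."""
    return len(message.strip()) >= minimum


def is_utf8(data):
    """True when the given bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def added_lines_with_tabs(diff_text):
    """Return added diff lines ('+' but not the '+++' header) that contain a tab."""
    return [
        line
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++") and "\t" in line
    ]


print(message_long_enough("fix login bug"))
print(is_utf8(b"\xff\xfe"))
print(added_lines_with_tabs("+++ A.java\t(rev 2)\n+\tint x;\n+int y;\n"))
```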

How to use

  1. Rename pre-commit.tmpl in the hooks folder to pre-commit
  2. Edit the file contents
  3. Make the file executable: chmod +x pre-commit

Automatically update after a commit

vim hooks/post-commit

#!/bin/sh
export LANG=zh_CN.UTF-8  # use UTF-8 encoding
SVN=/usr/bin/svn         # path to the svn binary in the svn installation's bin directory
WEB=/var/www/html      # directory to update
$SVN update $WEB --username xxxx --password xxxx