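1 Introduction
This post covers:
- When a proxy is needed
- How to implement a proxy in Java and the different ways to do it, with an example from a networked product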
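2 When a proxy is needed
Network environments vary. A typical case is the scenario I described [here](/posts/tiaoban_https/#2-2-%E5%86%85%E7%BD%91%E4%BA%91%E4%B8%BB%E6%9C%BA%E5%A6%82%E4%BD%95%E8%AE%BF%E9%97%AE%E5%A4%96%E7%BD%91): the program is deployed on an intranet cloud host and has to go through a proxy to reach external interfaces. This means that an API SDK has to offer a general way of reaching the API, one that also works when the intranet host goes through a proxy.
3 How I implemented a proxy in Java
Usually a forward proxy is configured on the intranet host, for example with vim /etc/profile: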
export HTTP_PROXY=http://172.26.3.141:3128/
export HTTPS_PROXY=https://172.26.3.141:3128/
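This sets the environment variables.
With this in place, commands such as curl can reach the external network, but a Java program cannot. Why not? As I see it, the question is which peer the TCP socket is opened to. curl reads the proxy address from the environment variables, so it connects to the proxy. A Java program that does nothing special opens its TCP connection straight to the target site, and from an intranet host that connection cannot succeed.
Normally we write an HTTP request like this, which is a direct connection: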
URL url = new URL(surl);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
So, before the two lines above, we can try adding a proxy to Java in the following way, and the request then goes through:
public void addProxy() {
    // Another approach: https://ask.csdn.net/questions/379970
    // https://yq.aliyun.com/articles/294
    // https://blog.csdn.net/qincidong/article/details/82454427
    // HttpHost proxy = new HttpHost(Application.PROXY_HOST, Application.PROXY_PORT);
    // RequestConfig requestConfig = RequestConfig.custom().setProxy(proxy)
    //         .setSocketTimeout(socketTimeout)
    //         .setConnectTimeout(connectTimeout).build();
    // Or a squid proxy may be needed:
    // https://blog.csdn.net/redhat456/article/details/6149774/
    // https://blog.csdn.net/kfanning/article/details/5481552
    // https://stackoverflow.com/questions/1432961/how-do-i-make-httpurlconnection-use-a-proxy
    // https://community.oracle.com/thread/1691437
    System.setProperty("java.net.useSystemProxies", "true");
    String hostname = "";
    String port = "";
    // https://blog.csdn.net/lirx_tech/article/details/51005281 (should this be specified as a list?)
    // String testUrl = "https://172.26.3.141:3128/";
    // https://blog.csdn.net/qq_29951983/article/details/79230671
    // https://www.cnblogs.com/littleatp/p/4729781.html
    MyProxySelector myProxySelector = new MyProxySelector();
    ProxySelector.setDefault(myProxySelector);
    // MyProxySelector ignores the URI argument, so passing null is tolerated here
    List<Proxy> l = ProxySelector.getDefault().select(null);
    for (Proxy proxy : l) {
        log.debug("!!!proxy type : " + proxy.type());
        InetSocketAddress addr = (InetSocketAddress) proxy.address();
        if (addr == null) {
            log.debug("!!!No Proxy");
        } else {
            log.debug("!!!proxy hostname : " + addr.getHostName());
            hostname = addr.getHostName();
            log.debug("!!!proxy port : " + addr.getPort());
            port = String.valueOf(addr.getPort());
        }
    }
    // Publish the selected proxy as JVM-wide system properties
    Properties systemProperties = System.getProperties();
    systemProperties.setProperty("http.proxyHost", hostname);
    systemProperties.setProperty("http.proxyPort", port);
    systemProperties.setProperty("https.proxyHost", hostname);
    systemProperties.setProperty("https.proxyPort", port);
}
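My understanding is that a system property is something the JVM itself recognizes, set either when the JVM is launched or from program code, and that URL honors it by default.
That statement is an assumption I drew from the experiment below; I did not have time to dig any deeper.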
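For contrast with the JVM-wide settings used in addProxy, the JDK also lets a single connection carry its own proxy through URL.openConnection(Proxy). The sketch below only illustrates that call and is not what the project used. The class and method names (PerConnectionProxy, httpProxy, open) are made up for this example, and the address is the proxy from this post.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

final class PerConnectionProxy {

    /** Builds an HTTP proxy descriptor; nothing is sent over the network here. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    /** Creates (but does not yet connect) a connection that will use the given proxy. */
    static HttpURLConnection open(String surl, Proxy proxy) throws IOException {
        URL url = new URL(surl);
        // The proxy applies to this connection only, not to the whole JVM
        return (HttpURLConnection) url.openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = httpProxy("172.26.3.141", 3128); // proxy address taken from this post
        HttpURLConnection conn = open("https://www.baidu.com", proxy);
        conn.setConnectTimeout(1000);
        System.out.println("HTTP status: " + conn.getResponseCode());
    }
}
```

The trade-off is scope: every call site has to pass the proxy itself, whereas system properties and the default ProxySelector apply to code that never mentions a proxy.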
import org.slf4j.LoggerFactory;
import java.io.IOException;
import java.net.*;
import java.util.ArrayList;
import java.util.List;
/**
* Production uses a jump host: the program runs on an intranet host and must be proxied through a public host to reach external HTTP/HTTPS interfaces
* @author tao
*
*/
public class MyProxySelector extends ProxySelector{
final static org.slf4j.Logger log = LoggerFactory.getLogger(MyProxySelector.class);
String PROXY_ADDR = "172.26.3.141";
int PROXY_PORT = 3128;
/**
* Selects all the applicable proxies based on the protocol to
* access the resource with and a destination address to access
* the resource at.
* The format of the URI is defined as follow:
* <UL>
* <LI>http URI for http connections</LI>
* <LI>https URI for https connections
* <LI>{@code socket://host:port}<br>
* for tcp client sockets connections</LI>
* </UL>
*
* @param uri The URI that a connection is required to
* @return a List of Proxies. Each element in the
* the List is of type
* {@link Proxy Proxy};
* when no proxy is available, the list will
* contain one element of type
* {@link Proxy Proxy}
* that represents a direct connection.
* @throws IllegalArgumentException if the argument is null
*/
@Override
public List<Proxy> select(URI uri) {
List<Proxy> list = new ArrayList<Proxy>();
list.add(new Proxy(Proxy.Type.HTTP, new InetSocketAddress(PROXY_ADDR, PROXY_PORT)));
return list;
}
/**
* Called to indicate that a connection could not be established
* to a proxy/socks server. An implementation of this method can
* temporarily remove the proxies or reorder the sequence of
* proxies returned by {@link #select(URI)}, using the address
* and the IOException caught when trying to connect.
*
* @param uri The URI that the proxy at sa failed to serve.
* @param sa The socket address of the proxy/SOCKS server
* @param ioe The I/O exception thrown when the connect failed.
* @throws IllegalArgumentException if either argument is null
*/
@Override
public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
log.debug("!!!cannot connect to proxy server {}:{}", PROXY_ADDR, PROXY_PORT);
}
}
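After going through a pile of web pages, I ended up with a solution that is not flexible. I originally wanted it to be flexible, but this MyProxySelector is really just two attributes, a host and a port, and the IP and port are hard-coded. Reading them from an environment variable and parsing the value would be about as good. Either way the solution is incomplete, for several reasons:
- When should the proxy be attached? The caller still has to write that condition, and if the condition changes, the code has to change with it.
- Should the proxy be attached on every request or only once? If only once, what is the criterion?
- If the proxy requires authentication, how is that implemented?
Even so, with no authentication, internal use only, and the proxy switched on or off depending on whether we run in production or in a debugging environment, attaching it on every request does no harm. Used that way, the solution works.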
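The idea of reading the proxy from configuration instead of hard-coding it could look roughly like the sketch below. It is not the code used in the project: UrlProxySelector and parse are hypothetical names, the fallback ports (80, or 443 for an https proxy URL) are an assumption, and proxy authentication is deliberately left out.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;

final class UrlProxySelector extends ProxySelector {
    private final Proxy proxy;

    /** @param proxyUrl for example "http://172.26.3.141:3128/"; null or blank means direct */
    UrlProxySelector(String proxyUrl) {
        this.proxy = parse(proxyUrl);
    }

    /** Turns a proxy URL into a Proxy, or NO_PROXY when nothing is configured. */
    static Proxy parse(String proxyUrl) {
        if (proxyUrl == null || proxyUrl.trim().isEmpty()) {
            return Proxy.NO_PROXY;
        }
        URI uri = URI.create(proxyUrl.trim());
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("no host in proxy URL: " + proxyUrl);
        }
        int port = uri.getPort();
        if (port == -1) {
            // Assumed defaults when the URL carries no port
            port = "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
        }
        // Unresolved address: the host name is resolved only when a connection is made
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(uri.getHost(), port));
    }

    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        return Collections.singletonList(proxy);
    }

    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        System.err.println("cannot connect to proxy " + sa + " for " + uri + ": " + ioe);
    }

    public static void main(String[] args) {
        // Install once at startup; with no variable set, connections stay direct
        ProxySelector.setDefault(new UrlProxySelector(System.getenv("HTTP_PROXY")));
        System.out.println(ProxySelector.getDefault().select(URI.create("https://www.baidu.com")));
    }
}
```

Installing the selector once at startup also settles the "every request or only once" question, because the decision then lives in configuration rather than in each caller.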
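4 Then a problem appeared
In the deployment described above, which needs a forward proxy, we used Alibaba Cloud's SMS service. It ships an SDK that is built on HTTP requests and wraps signing and the other authentication methods, plus endpoint management. Unfortunately we got a socket connect timeout, which plainly meant the network was unreachable.
Then we called addProxy. Still unreachable! Why do our own URL connections work as soon as the proxy is added? My instinct was that the Alibaba SDK might have its own requirements for configuring a proxy, and that I would have to look into it by checking the documentation or the SDK itself.
First I wanted to take a closer look at my dear addProxy and why its influence was apparently not global, so I set out to test its scope. For that I needed one web project and one jar, with three call sites in the web project: call 1 invokes an interface inside the project with addProxy, call 2 invokes an interface inside the project without the proxy, and call 3 invokes the external jar without the proxy.
The project is built around a Spring Boot main class, which is very convenient. The file tree is: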
├─main
│ ├─java
│ │ └─de
│ │ └─tao
│ │ └─proxytest
│ │ │ Starter.java
│ │ │
│ │ ├─controller
│ │ │ TestController1.java
│ │ │ TestController2.java
│ │ │ TestController3.java
│ │ │
│ │ └─http
│ │ HttpUtils1.java
│ │ HttpUtils2.java
│ │ MyProxySelector.java
│ │
│ ├─resources
│ │ application-dev.yml
│ │ application-pro.yml
│ │ application.yml
│ │
│ └─webapp
│ index.html
The core configuration is pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>de.tao</groupId>
<artifactId>proxytest</artifactId>
<version>1.0-SNAPSHOT</version>
<!--<packaging>war</packaging>-->
<properties>
<failOnMissingWebXml>false</failOnMissingWebXml>
<start-class>de.tao.proxytest.Starter</start-class>
</properties>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.0.0.RELEASE</version>
<relativePath />
</parent>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>de.tao</groupId>
<artifactId>proxytestinvoke</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<executions>
<execution>
<!--https://blog.csdn.net/ya2dan/article/details/50786464-->
<!-- Command to package the WAR: mvn compile package -Dmaven.test.skip=true -X -->
<goals>
<goal>repackage</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
The other classes are:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class Starter extends org.springframework.boot.web.servlet.support.SpringBootServletInitializer{
public static void main(String[] args) {
SpringApplication.run(Starter.class, args);
}
}
import de.tao.proxytest.http.HttpUtils1;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpMethod;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
@RestController
public class TestController1 {
@Autowired private HttpUtils1 utils1;
@GetMapping(value="/url1")
public String access1() {
String s = utils1.req(HttpMethod.GET, "https://www.baidu.com", null);
String result = "返回的内容为:";
if (s != null && s.length() > 0) {
result += s.replace("<!","");
}
return result;
}
}
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.*;
import java.util.Iterator;
import java.util.List;
import java.util.Properties;
@Component
public class HttpUtils1 {
final private org.slf4j.Logger log = LoggerFactory.getLogger(this.getClass());
@Value("${spring.profiles.active}")
private String profile;
public String req(org.springframework.http.HttpMethod method, String surl, String reqContent) {
String result = "";
BufferedReader reader = null;
try {
//if ("pro".equals(profile)) {
addProxy();
//}
URL url = new URL(surl);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod(method.name());
conn.setConnectTimeout(1000);
// Send the content in the request body
conn.setDoOutput(true);
conn.setDoInput(true);
// POST must not use caches
conn.setUseCaches(false);
conn.setRequestProperty("Connection", "Keep-Alive");
conn.setRequestProperty("Charset", "UTF-8");
// Set the content type:
//conn.setRequestProperty("Content-Type","multipart/form-data");
//conn.addRequestProperty("Content-Type", "multipart/form-data; boundary=--taoych");
// Set the Accept header, otherwise a 415 error is returned
conn.setRequestProperty("accept","*/*");// Brute force: accept every type to avoid a 415
//conn.setRequestProperty("accept","application/json");
if (reqContent != null) {
// Send data to the server
byte[] writebytes = reqContent.getBytes();
// Set the content length
conn.setRequestProperty("Content-Length", String.valueOf(writebytes.length));
OutputStream outwritestream = conn.getOutputStream();
outwritestream.write(writebytes);
outwritestream.flush();
outwritestream.close();
}
conn.connect();
int code = conn.getResponseCode();
if (code == 200) {
reader = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
result = reader.readLine();
}
log.debug("!req done, code: {}, result: {}", code, result);
} catch (Exception e) {
e.printStackTrace();
return null;
} finally {
if (reader != null) {
try {
reader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return result;
}
}
The external jar project is even simpler. Its file structure is:
├─main
│ ├─java
│ │ └─com
│ │ └─aliyun
│ │ HttpUtils2.java
Its configuration is:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>de.tao</groupId>
<artifactId>proxytestinvoke</artifactId>
<version>1.0-SNAPSHOT</version>
</project>
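It has only one class, whose content is the same as HttpUtils1.
Copy the web jar and the plain Java jar to the intranet cloud host and start the web jar with java -jar xx.jar. Then run the following curl commands on the public cloud host. The results are: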
curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url2
返回的内容为:
curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url1
返回的内容为:DOCTYPE html>
curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url2
返回的内容为:DOCTYPE html>
curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url3
返回的内容为:DOCTYPE html>
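So the effect of the system properties is global and reaches the external jar as well. The same holds for the default ProxySelector that addProxy installs, since everything runs inside one JVM process; that is where my earlier assumption came from. Then why does Alibaba's SDK ignore this global setting? The only way to find out is to read the SDK.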
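The JVM-wide reach of these properties can be checked without any network access. The sketch below assumes the JDK's built-in default selector is still installed (that is, nothing like addProxy has replaced it yet); GlobalProxyPropertyDemo and proxiesFor are names invented for this example.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

final class GlobalProxyPropertyDemo {

    /** Asks the JVM-wide selector which proxies it would use for the given URL. */
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        System.out.println("before: " + proxiesFor("http://www.baidu.com/"));
        // Set from anywhere in the process, e.g. from the web project
        System.setProperty("http.proxyHost", "172.26.3.141");
        System.setProperty("http.proxyPort", "3128");
        // Any other code in the same JVM, including an external jar, now gets the proxy
        System.out.println("after:  " + proxiesFor("http://www.baidu.com/"));
    }
}
```

Nothing in proxiesFor mentions a proxy, yet its answer changes once the properties are set, which is the same situation the external jar is in.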
5 How Alibaba's SDK handles the proxy
Calling the Alibaba SDK takes just these few lines:
IClientProfile profile = DefaultProfile.getProfile("cn-hangzhou", accessKeyId, accessKeySecret);
DefaultProfile.addEndpoint("cn-hangzhou", product, domain);
IAcsClient acsClient = new DefaultAcsClient(profile);
Going by the exception (all of the code below comes from the aliyun-java-sdk-core 4.2.3 package), the error was raised in the method com.aliyuncs.http.clients.CompatibleUrlConnClient#syncInvoke. Its code looks like this:
@Override
public HttpResponse syncInvoke(HttpRequest request) throws IOException {
OutputStream out = null;
InputStream content = null;
HttpResponse response = null;
HttpURLConnection httpConn = buildHttpConnection(request);
try {
httpConn.connect();
if (null != request.getHttpContent() && request.getHttpContent().length > 0) {
out = httpConn.getOutputStream();
out.write(request.getHttpContent());
}
content = httpConn.getInputStream();
response = new HttpResponse(httpConn.getURL().toString());
parseHttpConn(response, httpConn, content);
return response;
} catch (IOException e) {
content = httpConn.getErrorStream();
response = new HttpResponse(httpConn.getURL().toString());
parseHttpConn(response, httpConn, content);
return response;
} finally {
if (content != null) { content.close(); }
httpConn.disconnect();
}
}
Once you spot buildHttpConnection you are nearly there. Looking inside, sure enough, this code is the crux of the problem:
HttpURLConnection httpConn = null;
if (url.getProtocol().equalsIgnoreCase("https")) {
if (sslSocketFactory != null) {
Proxy proxy = getProxy("HTTPS_PROXY", request);
HttpsURLConnection httpsConn = (HttpsURLConnection)url.openConnection(proxy);
httpsConn.setSSLSocketFactory(sslSocketFactory);
httpConn = httpsConn;
}
}
if (httpConn == null) {
Proxy proxy = getProxy("HTTP_PROXY", request);
httpConn = (HttpURLConnection)url.openConnection(proxy);
}
The most directly relevant code is this:
private Proxy getProxy(String env, HttpRequest request) throws MalformedURLException, UnsupportedEncodingException {
Proxy proxy = Proxy.NO_PROXY;
String httpProxy = System.getenv(env);
if (httpProxy != null) {
URL proxyUrl = new URL(httpProxy);
String userInfo = proxyUrl.getUserInfo();
if (userInfo != null) {
byte[] bytes = userInfo.getBytes("UTF-8");
String auth = DatatypeConverter.printBase64Binary(bytes);
request.putHeaderParameter("Proxy-Authorization", "Basic " + auth);
}
String hostname = proxyUrl.getHost();
int port = proxyUrl.getPort();
if (port == -1) {
port = proxyUrl.getDefaultPort();
}
SocketAddress addr = new InetSocketAddress(hostname, port);
proxy = new Proxy(Proxy.Type.HTTP, addr);
}
return proxy;
}
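So it uses the values of the environment variables HTTP_PROXY and HTTPS_PROXY; note the upper case. When the variable is missing, getProxy returns Proxy.NO_PROXY, and because a proxy passed to URL.openConnection(Proxy) takes precedence over the JVM's default ProxySelector, the SDK connects directly no matter what addProxy configured. It also takes care of proxy authentication, which is the standard way of doing proxying.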
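The effect of passing a Proxy explicitly, as buildHttpConnection does, can be observed with a small experiment. The sketch below installs a selector that counts how often it is asked about an http URL and then tries to connect to a local port with no listener, so nothing leaves the machine. SelectorBypassDemo, CountingSelector and httpLookups are names invented for this example, and the sketch says nothing about the SDK beyond the JDK call it uses.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.ServerSocket;
import java.net.SocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class SelectorBypassDemo {

    /** Counts lookups for http URIs and always answers "connect directly". */
    static final class CountingSelector extends ProxySelector {
        final AtomicInteger httpLookups = new AtomicInteger();

        @Override
        public List<Proxy> select(URI uri) {
            if (uri != null && "http".equalsIgnoreCase(uri.getScheme())) {
                httpLookups.incrementAndGet();
            }
            return Collections.singletonList(Proxy.NO_PROXY);
        }

        @Override
        public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
            // Nothing to do in this experiment
        }
    }

    /** Picks a local port and releases it again, so a connect attempt fails quickly. */
    static int closedLocalPort() throws IOException {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        }
    }

    /** Attempts one connection and reports how often the default selector was asked. */
    static int httpLookups(boolean explicitNoProxy) throws IOException {
        ProxySelector previous = ProxySelector.getDefault();
        CountingSelector counting = new CountingSelector();
        ProxySelector.setDefault(counting);
        try {
            URL url = new URL("http://127.0.0.1:" + closedLocalPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) (explicitNoProxy
                    ? url.openConnection(Proxy.NO_PROXY)
                    : url.openConnection());
            conn.setConnectTimeout(1000);
            try {
                conn.connect();
            } catch (IOException expected) {
                // A refused connection is fine; only the selector lookup matters
            } finally {
                conn.disconnect();
            }
            return counting.httpLookups.get();
        } finally {
            ProxySelector.setDefault(previous); // leave the JVM as we found it
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("openConnection():         " + httpLookups(false));
        System.out.println("openConnection(NO_PROXY): " + httpLookups(true));
    }
}
```

If the second count stays at zero while the first does not, the default selector was never consulted for the connection that received an explicit Proxy.NO_PROXY.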
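6 Going back to solve the problem
So let's try it: what does reading the environment variable HTTP_PROXY return inside my dear addProxy project? What I had actually set in /etc/profile was http_proxy. There it is: lowercase!
Adding this code shows the result: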
String httpProxy = System.getenv("HTTP_PROXY");
System.out.println("system proxy env: " + httpProxy);
httpProxy = System.getenv("http_proxy");
System.out.println("system proxy lowercase env: " + httpProxy);
The result is:
system proxy env: null
system proxy lowercase env: http://172.26.3.141:3128/
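Sure enough, the uppercase variable cannot be read. So I set it: I added the two uppercase variables to the profile file and loaded them with the source command. The current Xshell session window then had them, and rerunning the test program printed the value.
Then I restarted the Tomcat program and it still did not work! I removed the lowercase variables; still no luck! What on earth? In the Xshell session window that starts Tomcat I checked with the env command: why was there no uppercase HTTP_PROXY in env? After running source there, it appeared. So running source in one Xshell session window does not carry over to other session windows.
And so adding one environment variable solved the problem. Testing and locating the problem took up most of the time.
7 Perhaps our proxy approach can be improved
To be continued.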
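In our own code, the upper/lower-case trap can be avoided by trying both spellings when reading the variable. The sketch below is one way to do that; ProxyEnv and lookup are hypothetical names. It does not change what the Alibaba SDK reads: the SDK still needs the uppercase variables to be present in the environment of the process that starts the JVM.

```java
import java.util.Locale;
import java.util.Map;

final class ProxyEnv {

    /** Looks a variable up by the given name, then in upper case, then in lower case. */
    static String lookup(Map<String, String> env, String name) {
        String[] candidates = {
            name,
            name.toUpperCase(Locale.ROOT),
            name.toLowerCase(Locale.ROOT)
        };
        for (String candidate : candidates) {
            String value = env.get(candidate);
            if (value != null && !value.trim().isEmpty()) {
                return value.trim();
            }
        }
        return null; // not set under any spelling
    }

    public static void main(String[] args) {
        // System.getenv() is the environment this JVM was started with;
        // a later `source /etc/profile` in another shell does not change it
        Map<String, String> env = System.getenv();
        System.out.println("HTTP_PROXY  -> " + lookup(env, "HTTP_PROXY"));
        System.out.println("HTTPS_PROXY -> " + lookup(env, "HTTPS_PROXY"));
    }
}
```

Taking the map as a parameter keeps the lookup easy to try out with a hand-built environment.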